class StreamingHistogram extends MutableHistogram[Double]

Ben-Haim, Yael, and Elad Tom-Tov. "A streaming parallel decision tree algorithm." The Journal of Machine Learning Research 11 (2010): 849-872.

NOTE: Because bucket boundaries depend on the values seen so far, the order in which values are counted can affect bucket distribution and counts when StreamingHistogram instances are merged.
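
The order dependence noted above can be illustrated with a minimal sketch of the bounded-bucket update described in the cited paper: each new value becomes a bucket, and when the bucket limit is exceeded the two closest adjacent buckets are merged at their count-weighted mean. The `OrderSketch` object, its `Bucket` alias, and the `add` and `build` helpers are hypothetical names introduced for illustration; this is not the class's actual implementation.

```scala
object OrderSketch {
  // Hypothetical stand-in for the library's Bucket: (label, count).
  type Bucket = (Double, Long)

  /** Count one item, then merge the closest adjacent pair if over capacity. */
  def add(buckets: Vector[Bucket], item: Double, size: Int): Vector[Bucket] = {
    require(size >= 1, "size must be positive")
    // Buckets stay sorted by label: split at the insertion point.
    val (below, rest) = buckets.span(_._1 < item)
    val inserted = rest.headOption match {
      case Some((label, count)) if label == item =>
        below ++ (((label, count + 1L)) +: rest.tail) // existing label: bump count
      case _ =>
        below ++ (((item, 1L)) +: rest)               // new label: new bucket
    }
    if (inserted.length <= size) inserted
    else {
      def gap(i: Int): Double = inserted(i + 1)._1 - inserted(i)._1
      // Left index of the closest adjacent pair (the first pair wins ties).
      val i = (0 until inserted.length - 1)
        .reduceLeft((a, b) => if (gap(b) < gap(a)) b else a)
      val (x1, c1) = inserted(i)
      val (x2, c2) = inserted(i + 1)
      // The merged bucket sits at the count-weighted mean of the two labels.
      val merged: Bucket = ((x1 * c1 + x2 * c2) / (c1 + c2), c1 + c2)
      inserted.patch(i, Vector(merged), 2)
    }
  }

  /** Feed the items in the given order into an empty bucket list. */
  def build(items: Seq[Double], size: Int): Vector[Bucket] =
    items.foldLeft(Vector.empty[Bucket])((acc, x) => add(acc, x, size))
}
```

With a limit of two buckets, `OrderSketch.build(Seq(0.0, 10.0, 21.0, 33.0), 2)` ends with buckets (5.0, 2) and (27.0, 2), while the same values in reverse order end with a bucket of count 3 near 10.33 and a bucket (33.0, 1). The total count is 4 in both cases; only the bucket layout differs.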

Linear Supertypes
  MutableHistogram[Double], Histogram[Double], Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new StreamingHistogram(size: Int, minimum: Double = Double.PositiveInfinity, maximum: Double = Double.NegativeInfinity)
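
The default arguments for `minimum` and `maximum` look inverted at first glance. One plausible reading, which this page does not confirm, is that they act as sentinels: any finite first value replaces both, and an untouched sentinel can be reported as "nothing seen yet", matching the `Option[Double]` results of `minValue()` and `maxValue()`. The `RangeTracker` class below is a hypothetical illustration of that idiom, not the library's code.

```scala
// Hypothetical tracker illustrating the sentinel idiom; not the library's code.
final class RangeTracker(
  private var min: Double = Double.PositiveInfinity,
  private var max: Double = Double.NegativeInfinity
) {
  /** Record one value, widening the tracked range if needed. */
  def see(item: Double): Unit = {
    if (item < min) min = item
    if (item > max) max = item
  }

  // An untouched sentinel means no value has been seen yet.
  def minValue: Option[Double] = if (min.isPosInfinity) None else Some(min)
  def maxValue: Option[Double] = if (max.isNegInfinity) None else Some(max)
}
```

Under this reading, a freshly constructed tracker reports `None` for both bounds, and the first call to `see` sets both.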

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. def +(other: StreamingHistogram): StreamingHistogram

    Create a new histogram from this one and another without altering either.

  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. def areaUnderCurve(): Double

    Return the area under the curve.

  6. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  7. def binCounts(): Seq[(Double, Long)]

    Return a sequence of tuples pairing each bin label value with its associated count.

    Definition Classes
    Histogram
  8. def bucketCount(): Int

    The number of buckets utilized by this Histogram.

    Definition Classes
    StreamingHistogram → Histogram
  9. def buckets(): List[Bucket]

    Return the list of buckets of this histogram. Primarily useful for debugging and serialization.

  10. def cdf(): Array[(Double, Double)]

    Return an array of (x, cdf(x)) pairs.

    Definition Classes
    StreamingHistogram → Histogram
  11. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  12. def countItem(item: Double, count: Long): Unit

    Note the occurrence of 'item'.

    The optional parameter 'count' allows histograms to be built more efficiently. Negative counts can be used to remove a particular number of occurrences of 'item'.

    Definition Classes
    StreamingHistogram → MutableHistogram
  13. def countItem(item: Double): Unit

    Note the occurrence of 'item'.

    Definition Classes
    MutableHistogram
  14. def countItemInt(item: Int, count: Long): Unit

    Note the occurrence of 'item'.

    The optional parameter 'count' allows histograms to be built more efficiently. Negative counts can be used to remove a particular number of occurrences of 'item'.

    Definition Classes
    StreamingHistogram → MutableHistogram
  15. def countItemInt(item: Int): Unit

    Note the occurrence of 'item'.

    Definition Classes
    MutableHistogram
  16. def countItems(items: Seq[Double])(implicit dummy: DummyImplicit): Unit

    Note the occurrences of 'items'.

  17. def countItems(items: Seq[Bucket]): Unit

    Note the occurrences of 'items'.

  18. def deltas(): List[Delta]

    Return the list of deltas of this histogram. Primarily useful for debugging.

  19. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  20. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  21. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  22. def foreach(f: (Double, Long) ⇒ Unit): Unit

    Execute the given function on each bucket. The value contained by the bucket is a Double, and the count is a Long (ergo the signature of the function 'f').

    Definition Classes
    StreamingHistogram → Histogram
  23. def foreachValue(f: (Double) ⇒ Unit): Unit

    Execute the given function on each bucket label.

    f

    A unit function of one parameter

    Definition Classes
    StreamingHistogram → Histogram
  24. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  25. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  26. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  27. def itemCount(item: Double): Long

    Get the (approximate) number of occurrences of an item.

    Definition Classes
    StreamingHistogram → Histogram
  28. def maxBucketCount(): Int

    Return the maximum number of buckets of this histogram.

    Definition Classes
    StreamingHistogram → Histogram
  29. def maxValue(): Option[Double]

    Get the maximum value this histogram has seen.

    Definition Classes
    StreamingHistogram → Histogram
  30. def mean(): Option[Double]

    Return the approximate mean of the histogram.

    Definition Classes
    StreamingHistogram → Histogram
  31. def median(): Option[Double]

    Return the approximate median of the histogram.

    Definition Classes
    StreamingHistogram → Histogram
  32. def merge(histogram: Histogram[Double]): StreamingHistogram

    Return the sum of this histogram and the given one (the sum is the histogram that would result from seeing all of the values seen by the two antecedent histograms).

    Definition Classes
    StreamingHistogram → Histogram
  33. def minMaxValues(): Option[(Double, Double)]

    Return the smallest and largest items seen as a tuple.

    Definition Classes
    Histogram
  34. def minValue(): Option[Double]

    Get the minimum value this histogram has seen.

    Definition Classes
    StreamingHistogram → Histogram
  35. def mode(): Option[Double]

    Return the approximate mode of the distribution. This is done by simply returning the label of the most populous bucket, so the answer can be a poor estimate.

    Definition Classes
    StreamingHistogram → Histogram
  36. def mutable(): StreamingHistogram

    Return a mutable copy of this histogram.

    Definition Classes
    StreamingHistogram → Histogram
  37. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  38. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  39. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  40. def percentile(q: Double): Double
  41. def percentileBreaks(qs: Seq[Double]): Seq[Double]

    For each q in qs, each between 0 and 1, find the (approximate) value at that quantile.

    qs

    A list of quantiles (0.01 == 1st percentile, 0.2 == 20th percentile) to use in generating breaks

    Note

    Our aim here is to produce values corresponding to the qs which stretch from minValue to maxValue, interpolating based on observed bins along the way.

  42. def percentileRanking(item: Double): Double

    Get the (approximate) percentile of this item.

  43. def quantileBreaks(num: Int): Array[Double]

    Return the (approximate) quantile breaks of the distribution of points that the histogram has seen so far. It is guaranteed that no value in the returned array will be outside the minimum-maximum range of values seen.

    num

    The number of breaks desired

    Definition Classes
    StreamingHistogram → MutableHistogram → Histogram
  44. def rawValues(): Array[Double]

    Return an array containing the values seen by this histogram.

    Definition Classes
    StreamingHistogram → Histogram
  45. def setItem(item: Double, count: Long): Unit

    Make a change to the distribution to approximate changing the value of a particular item.

    Definition Classes
    StreamingHistogram → MutableHistogram
    Note

    _min and _max, the minimum and maximum values seen by the histogram, are not changed by this.

  46. def statistics(): Option[Statistics[Double]]

    Generate Statistics.

    Definition Classes
    StreamingHistogram → Histogram
  47. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  48. def toString(): String
    Definition Classes
    AnyRef → Any
  49. def totalCount(): Long

    Total number of samples used to build this histogram.

    Definition Classes
    StreamingHistogram → Histogram
  50. def uncountItem(item: Double): Unit

    Uncount item.

    Definition Classes
    StreamingHistogram → MutableHistogram
    Note

    _min and _max, the minimum and maximum values seen by the histogram, are not changed by this.

  51. def update(other: Histogram[Double]): Unit

    Update this histogram with the entries from another.

    Definition Classes
    StreamingHistogram → MutableHistogram
  52. def values(): Array[Double]

    Return an array of bucket values.

    Definition Classes
    StreamingHistogram → Histogram
  53. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  54. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  55. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
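
Several of the value members above (mean, mode, cdf, percentileBreaks) are described as approximations computed from bucket labels and counts. The sketch below shows one way such summaries can be derived from a sequence of (label, count) pairs shaped like the result of binCounts(). The `BucketStats` object and its helpers are hypothetical, positive counts are assumed, and the interpolation runs between the first and last bin labels rather than between minValue and maxValue, so the class itself may well compute these values differently.

```scala
object BucketStats {
  // (label, count), the shape that binCounts() is documented to return.
  type Bin = (Double, Long)

  /** Count-weighted average of the labels; None for an empty histogram. */
  def mean(bins: Seq[Bin]): Option[Double] = {
    val total = bins.map(_._2).sum
    if (total == 0L) None
    else Some(bins.map { case (x, c) => x * c }.sum / total)
  }

  /** Label of the most populous bin (the first bin wins ties). */
  def mode(bins: Seq[Bin]): Option[Double] =
    if (bins.isEmpty) None
    else Some(bins.reduceLeft((a, b) => if (b._2 > a._2) b else a)._1)

  /** (x, cdf(x)) pairs: running share of the total count at each label. */
  def cdf(bins: Seq[Bin]): Seq[(Double, Double)] = {
    val total = bins.map(_._2).sum.toDouble
    val running = bins.scanLeft(0L)((acc, b) => acc + b._2).tail
    bins.map(_._1).zip(running.map(_ / total))
  }

  /** Value at quantile q in [0, 1], linearly interpolated between cdf points. */
  def percentile(bins: Seq[Bin], q: Double): Option[Double] = {
    require(q >= 0.0 && q <= 1.0, "q must lie in [0, 1]")
    val points = cdf(bins)
    if (points.isEmpty) None
    else if (q <= points.head._2) Some(points.head._1)
    else points.zip(points.tail).collectFirst {
      // First segment whose upper cdf value reaches q.
      case ((x0, p0), (x1, p1)) if q <= p1 =>
        x0 + (x1 - x0) * (q - p0) / (p1 - p0)
    }
  }
}
```

For bins `Seq((1.0, 1L), (2.0, 2L), (4.0, 1L))`, the helpers give a mean of 2.25, a mode of 2.0, cdf points (1.0, 0.25), (2.0, 0.75), (4.0, 1.0), and a value of 1.5 at quantile 0.5. The mode example also shows why the label of the most populous bucket can be a rough answer: it says nothing about how values are spread inside that bucket.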
