class StreamingHistogram extends MutableHistogram[Double]
Ben-Haim, Yael, and Elad Tom-Tov. "A streaming parallel decision tree algorithm." The Journal of Machine Learning Research 11 (2010): 849-872.
NOTE: The order in which values are counted could affect Bucket distribution and counts as StreamingHistogram instances are merged, due to the way in which bucket boundaries are defined.
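The order dependence noted above can be illustrated with a minimal sketch of the update step described in the cited paper: insert the new value as its own bucket, then, if the bucket limit is exceeded, fuse the two closest buckets into one whose label is their count-weighted mean. The names `OrderSketch`, `Bucket`, `add`, and `build` are hypothetical and are not part of this class; the actual implementation may differ in its details.

```scala
object OrderSketch {
  // Hypothetical bucket type: a label (centroid) and the number of values it represents.
  final case class Bucket(label: Double, count: Long)

  // Insert one value; if that exceeds maxBuckets, fuse the closest adjacent pair.
  def add(buckets: Vector[Bucket], x: Double, maxBuckets: Int): Vector[Bucket] = {
    val sorted = (buckets :+ Bucket(x, 1L)).sortBy(_.label)
    if (sorted.size <= maxBuckets) sorted
    else {
      // Index of the adjacent pair with the smallest gap between labels.
      val i = (0 until sorted.size - 1).minBy(j => sorted(j + 1).label - sorted(j).label)
      val (a, b) = (sorted(i), sorted(i + 1))
      val n = a.count + b.count
      // The fused bucket sits at the count-weighted mean of the two labels.
      val merged = Bucket((a.label * a.count + b.label * b.count) / n, n)
      sorted.patch(i, Seq(merged), 2)
    }
  }

  // Feed the values one at a time, in the order given.
  def build(values: Seq[Double], maxBuckets: Int): Vector[Bucket] =
    values.foldLeft(Vector.empty[Bucket])(add(_, _, maxBuckets))
}
```

With a limit of two buckets, `OrderSketch.build(Seq(0.0, 3.0, 5.5, 10.0), 2)` ends with counts 3 and 1, while `OrderSketch.build(Seq(0.0, 10.0, 5.5, 3.0), 2)` ends with buckets at 1.5 and 7.75 holding two values each. The same four values produce different bucket boundaries.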
Linear Supertypes
MutableHistogram, Histogram, Serializable, AnyRef, Any
Instance Constructors
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
def
+(other: StreamingHistogram): StreamingHistogram
Create a new histogram from this one and another without altering either.
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
areaUnderCurve(): Double
Return the area under the curve.
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
binCounts(): Seq[(Double, Long)]
Return a sequence of tuples pairing each bin label with its associated count.
- Definition Classes
- Histogram
-
def
bucketCount(): Int
The number of buckets utilized by this Histogram.
- Definition Classes
- StreamingHistogram → Histogram
-
def
buckets(): List[Bucket]
Return the list of buckets of this histogram. Primarily useful for debugging and serialization.
-
def
cdf(): Array[(Double, Double)]
Return an array of (x, cdf(x)) pairs.
- Definition Classes
- StreamingHistogram → Histogram
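As a rough illustration of what (x, cdf(x)) pairs can look like, the sketch below derives them from label/count pairs such as those returned by `binCounts()`. It assumes the bins are sorted by label and that cdf(x) is the fraction of samples in bins with label at or below x; the class itself may compute the values differently. `CdfSketch` is a hypothetical name.

```scala
object CdfSketch {
  // bins: (label, count) pairs, assumed sorted by label.
  def cdf(bins: Seq[(Double, Long)]): Array[(Double, Double)] = {
    val total = bins.map(_._2).sum.toDouble
    // Running count up to and including each bin.
    val running = bins.scanLeft(0L)(_ + _._2).tail
    bins.map(_._1).zip(running.map(_ / total)).toArray
  }
}
```

For example, `CdfSketch.cdf(Seq((1.0, 1L), (2.0, 3L)))` pairs 1.0 with 0.25 and 2.0 with 1.0.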
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
countItem(item: Double, count: Long): Unit
Note the occurrence of 'item'.
The optional parameter 'count' allows histograms to be built more efficiently. Negative counts can be used to remove a particular number of occurrences of 'item'.
- Definition Classes
- StreamingHistogram → MutableHistogram
-
def
countItem(item: Double): Unit
Note the occurrence of 'item'.
- Definition Classes
- MutableHistogram
-
def
countItemInt(item: Int, count: Long): Unit
Note the occurrence of 'item'.
The optional parameter 'count' allows histograms to be built more efficiently. Negative counts can be used to remove a particular number of occurrences of 'item'.
- Definition Classes
- StreamingHistogram → MutableHistogram
-
def
countItemInt(item: Int): Unit
Note the occurrence of 'item'.
- Definition Classes
- MutableHistogram
-
def
countItems(items: Seq[Double])(implicit dummy: DummyImplicit): Unit
Note the occurrences of 'items'.
-
def
countItems(items: Seq[Bucket]): Unit
Note the occurrences of 'items'.
-
def
deltas(): List[Delta]
Return the list of deltas of this histogram. Primarily useful for debugging.
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
foreach(f: (Double, Long) ⇒ Unit): Unit
Execute the given function on each bucket. The value contained by the bucket is a Double and the count is a Long (hence the signature of the function 'f').
- Definition Classes
- StreamingHistogram → Histogram
-
def
foreachValue(f: (Double) ⇒ Unit): Unit
Execute the given function on each bucket label.
- f
A unit function of one parameter
- Definition Classes
- StreamingHistogram → Histogram
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
itemCount(item: Double): Long
Get the (approximate) number of occurrences of an item.
- Definition Classes
- StreamingHistogram → Histogram
-
def
maxBucketCount(): Int
Return the maximum number of buckets of this histogram.
- Definition Classes
- StreamingHistogram → Histogram
-
def
maxValue(): Option[Double]
Get the maximum value this histogram has seen.
- Definition Classes
- StreamingHistogram → Histogram
-
def
mean(): Option[Double]
Return the approximate mean of the histogram.
- Definition Classes
- StreamingHistogram → Histogram
-
def
median(): Option[Double]
Return the approximate median of the histogram.
- Definition Classes
- StreamingHistogram → Histogram
-
def
merge(histogram: Histogram[Double]): StreamingHistogram
Return the sum of this histogram and the given one (the sum is the histogram that would result from seeing all of the values seen by the two antecedent histograms).
- Definition Classes
- StreamingHistogram → Histogram
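One way to picture the sum of two histograms is to pool their buckets and then shrink the pooled list back to the bucket limit by repeatedly fusing the closest adjacent pair, as in the merge procedure of the cited paper. The sketch below works under that assumption; `MergeSketch`, `Bucket`, `shrink`, and the `maxBuckets` parameter are hypothetical names, and the class's own `merge` may differ.

```scala
object MergeSketch {
  // Hypothetical bucket type: a label (centroid) and a count.
  final case class Bucket(label: Double, count: Long)

  // Fuse the closest adjacent pair until at most maxBuckets remain.
  @annotation.tailrec
  def shrink(sorted: Vector[Bucket], maxBuckets: Int): Vector[Bucket] =
    if (sorted.size <= maxBuckets || sorted.size < 2) sorted
    else {
      val i = (0 until sorted.size - 1).minBy(j => sorted(j + 1).label - sorted(j).label)
      val (a, b) = (sorted(i), sorted(i + 1))
      val n = a.count + b.count
      val fused = Bucket((a.label * a.count + b.label * b.count) / n, n)
      shrink(sorted.patch(i, Seq(fused), 2), maxBuckets)
    }

  // Pool both bucket lists, sort by label, and shrink; the inputs are left untouched.
  def merge(left: Seq[Bucket], right: Seq[Bucket], maxBuckets: Int): Vector[Bucket] =
    shrink((left ++ right).toVector.sortBy(_.label), maxBuckets)
}
```

Merging buckets (1.0, 1) and (10.0, 1) with buckets (2.0, 3) and (11.0, 1) under a limit of two yields (1.75, 4) and (10.5, 2): the total count of 6 is preserved, but the original labels are not.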
-
def
minMaxValues(): Option[(Double, Double)]
Return the smallest and largest items seen as a tuple.
- Definition Classes
- Histogram
-
def
minValue(): Option[Double]
Get the minimum value this histogram has seen.
- Definition Classes
- StreamingHistogram → Histogram
-
def
mode(): Option[Double]
Return the approximate mode of the distribution. This simply returns the label of the most populous bucket, so the answer can be a poor approximation.
- Definition Classes
- StreamingHistogram → Histogram
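The approach described for `mode()` amounts to a maximum over bucket counts. A minimal sketch over label/count pairs such as those returned by `binCounts()` follows; `ModeSketch` is a hypothetical name.

```scala
object ModeSketch {
  // Label of the bucket with the highest count, or None when there are no buckets.
  def mode(bins: Seq[(Double, Long)]): Option[Double] =
    if (bins.isEmpty) None else Some(bins.maxBy(_._2)._1)
}
```

Because a bucket label is a centroid of every value merged into that bucket, the returned label need not equal any value that was actually counted, which is why the result can be far from the true mode.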
-
def
mutable(): StreamingHistogram
Return a mutable copy of this histogram.
- Definition Classes
- StreamingHistogram → Histogram
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
percentile(q: Double): Double
-
def
percentileBreaks(qs: Seq[Double]): Seq[Double]
For each q in qs, all between 0 and 1, find a number (approximately) at the qth percentile.
- qs
A list of quantiles (0.01 == 1st percentile, 0.2 == 20th percentile) to use in generating breaks
- Note
Our aim here is to produce values corresponding to the qs which stretch from minValue to maxValue, interpolating based on observed bins along the way
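The note above describes interpolating between the minimum and maximum along the observed bins. The following is a minimal sketch of one such scheme, not the method this class uses. It assumes bins sorted by label with positive counts, places each bin label at the midpoint of its share of the cumulative distribution, anchors the minimum at 0 and the maximum at 1, and interpolates linearly between those points. `PercentileSketch` and `breaks` are hypothetical names.

```scala
object PercentileSketch {
  // bins: (label, count) pairs sorted by label; min/max: extreme values seen.
  def breaks(bins: Seq[(Double, Long)], min: Double, max: Double, qs: Seq[Double]): Seq[Double] = {
    val total = bins.map(_._2).sum.toDouble
    // Count of samples strictly before each bin.
    val before = bins.scanLeft(0L)(_ + _._2)
    // Assumption: a bin's label sits at the midpoint of its cumulative share.
    val mids = bins.zip(before).map { case ((label, n), b) => (label, (b + n / 2.0) / total) }
    val pts = ((min, 0.0) +: mids) :+ ((max, 1.0))
    qs.map { q0 =>
      val q = math.max(0.0, math.min(1.0, q0)) // clamp to [0, 1]
      val i = pts.indexWhere(_._2 >= q)        // first point at or above q
      if (i <= 0) min
      else {
        val (xa, qa) = pts(i - 1)
        val (xb, qb) = pts(i)
        xa + (xb - xa) * (q - qa) / (qb - qa)  // linear interpolation
      }
    }
  }
}
```

With bins (2.0, 2) and (6.0, 2), minimum 0.0, and maximum 8.0, the quantiles 0.0, 0.5, and 1.0 map to 0.0, 4.0, and 8.0, so the breaks stretch across the full observed range.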
-
def
percentileRanking(item: Double): Double
Get the (approximate) percentile of this item.
-
def
quantileBreaks(num: Int): Array[Double]
This method returns the (approximate) quantile breaks of the distribution of points that the histogram has seen so far. It is guaranteed that no value in the returned array will be outside the minimum-maximum range of values seen.
- num
The number of breaks desired
- Definition Classes
- StreamingHistogram → MutableHistogram → Histogram
-
def
rawValues(): Array[Double]
Return an array containing the values seen by this histogram.
- Definition Classes
- StreamingHistogram → Histogram
-
def
setItem(item: Double, count: Long): Unit
Make a change to the distribution to approximate changing the value of a particular item.
- Definition Classes
- StreamingHistogram → MutableHistogram
- Note
_min and _max, the minimum and maximum values seen by the histogram, are not changed by this.
-
def
statistics(): Option[Statistics[Double]]
Generate Statistics.
- Definition Classes
- StreamingHistogram → Histogram
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
def
totalCount(): Long
Total number of samples used to build this histogram.
- Definition Classes
- StreamingHistogram → Histogram
-
def
uncountItem(item: Double): Unit
Uncount item.
- Definition Classes
- StreamingHistogram → MutableHistogram
- Note
_min and _max, the minimum and maximum values seen by the histogram, are not changed by this.
-
def
update(other: Histogram[Double]): Unit
Update this histogram with the entries from another.
- Definition Classes
- StreamingHistogram → MutableHistogram
-
def
values(): Array[Double]
Return an array of bucket values.
- Definition Classes
- StreamingHistogram → Histogram
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()