River - a library for data stream mining

Dr Jacob Montiel

  • Incremental learning - All the tools in River can be updated one sample at a time.
  • Adaptive learning - Adaptive methods are specifically designed to be robust against concept drift in dynamic environments.
  • General-purpose - River caters for different machine learning problems, including regression, classification, unsupervised learning, and ad-hoc tasks.
  • Efficient - Because data streams are unbounded, streaming techniques are designed to use memory and processing time efficiently.
  • Easy to use - River is intended for users with any experience level. As a machine learning package, it caters for practitioners as well as researchers.
  • Expandable - River is a constantly evolving resource with new and updated tools providing additional, or improved, capabilities.
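
The incremental-learning idea above can be sketched in plain Python. This is a minimal illustration, not River's implementation: the `OnlineLinearRegression` class and the `synthetic_stream` generator are hypothetical, and only the method names `learn_one` and `predict_one` follow River's naming convention. Each sample is used first to test the model and then to train it, so no sample is ever stored.

```python
class OnlineLinearRegression:
    """Hypothetical linear model updated by one SGD step per sample."""

    def __init__(self, lr=0.1):
        self.lr = lr
        self.weights = {}  # one weight per feature name, created on first sight
        self.bias = 0.0

    def predict_one(self, x):
        # x is a dict mapping feature names to numeric values
        return self.bias + sum(self.weights.get(k, 0.0) * v for k, v in x.items())

    def learn_one(self, x, y):
        # Gradient step on the squared error of this single sample
        error = self.predict_one(x) - y
        for k, v in x.items():
            self.weights[k] = self.weights.get(k, 0.0) - self.lr * error * v
        self.bias -= self.lr * error


def synthetic_stream(n):
    """Hypothetical noise-free stream following y = 2x + 1."""
    for i in range(n):
        x = (i % 10) / 10
        yield {"x": x}, 2 * x + 1


model = OnlineLinearRegression(lr=0.1)
abs_error_sum = 0.0
n_seen = 0

for x, y in synthetic_stream(3000):
    abs_error_sum += abs(model.predict_one(x) - y)  # test on the sample first
    model.learn_one(x, y)                           # then train on it
    n_seen += 1

running_mae = abs_error_sum / n_seen
print(f"samples seen: {n_seen}, running MAE: {running_mae:.4f}")
print(f"learned weight: {model.weights['x']:.3f}, bias: {model.bias:.3f}")
```

Memory use stays constant regardless of how many samples arrive, because the model keeps only its weights and the evaluation keeps only a running sum.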

  • Topics

    • From batch to stream learning.
    • Evaluating model accuracy.
    • Processing training samples one at a time.
    • Python programming.
    • Stream processing.
      • Basic concepts.
      • Data pre-processing.
    • Sample problem - NOAA weather data ('NEWWeather' dataset)
      • Decision Trees.
      • Pipelines (chaining sequences of operations).
      • Visualising operations.
      • Concept drift.
  • Additional Resources