GitHub - josephreisinger/vowpal_porpoise: lightweight python wrapper for vowpal wabbit

vowpal_porpoise

Lightweight python wrapper for vowpal_wabbit.

Why: Scalable, blazingly fast machine learning.

Install

  1. Install vowpal_wabbit: clone the repository and run make.
  2. Install cython: pip install cython
  3. Clone vowpal_porpoise.
  4. Run python setup.py install.

Now you can do: import vowpal_porpoise from python.

Examples

Standard Interface

Linear regression with l1 penalty:

from vowpal_porpoise import VW

# Initialize the model
vw = VW(moniker='test',    # a name for the model
        passes=10,         # vw arg: passes
        loss='quadratic',  # vw arg: loss
        learning_rate=10,  # vw arg: learning_rate
        l1=0.01)           # vw arg: l1

# Inside the with training() block a vw process will be 
# open to communication
with vw.training():
    for instance in ['1 |big red square',
                     '0 |small blue circle']:
        vw.push_instance(instance)

    # here stdin will close
# here the vw process will have finished

# Inside the with predicting() block we can stream instances and 
# acquire their labels
with vw.predicting():
    for instance in ['1 |large burnt sienna rhombus',
                     '0 |little teal oval']:
        vw.push_instance(instance)

# Read the predictions like this:
predictions = list(vw.read_predictions_())
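
The instances above are plain strings of the form "label |features". As a minimal sketch, a helper can build such strings from a label and a list of tokens. The name to_vw_line is hypothetical and not part of vowpal_porpoise; the output simply mirrors the shape of the example strings above.

```python
import re

def to_vw_line(label, tokens):
    """Format one instance as '<label> |<tok> <tok> ...', the shape used above."""
    # ':' and '|' have special meaning in vw's input format and whitespace
    # separates features, so replace them inside each token.
    cleaned = [re.sub(r"[\s:|]+", "_", str(token)) for token in tokens]
    return "%s |%s" % (label, " ".join(cleaned))

line = to_vw_line(1, ["big", "red", "square"])
```

Each resulting string could then be passed to vw.push_instance inside a training() or predicting() block.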

L-BFGS with a rank-5 approximation:

from vowpal_porpoise import VW

# Initialize the model
vw = VW(moniker='test_lbfgs', # a name for the model
        passes=10,            # vw arg: passes
        lbfgs=True,           # turn on lbfgs
        mem=5)                # lbfgs rank

Latent Dirichlet Allocation with 100 topics:

from vowpal_porpoise import VW

# Initialize the model
vw = VW(moniker='test_lda',  # a name for the model
        passes=10,           # vw arg: passes
        lda=100,             # turn on lda
        minibatch=100)       # set the minibatch size

Scikit-learn Interface

vowpal_porpoise also ships with an interface into scikit-learn, which allows awesome experiment-level stuff like cross-validation:

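# Note: sklearn.cross_validation, sklearn.grid_search and the score_func
# argument belong to older scikit-learn releases.
from sklearn.cross_validation import StratifiedKFold
from sklearn.grid_search import GridSearchCV
from sklearn.metrics import f1_score
from vowpal_porpoise.sklearn import VW_Classifier

# parameters (the grid to search), X_train and y_train are assumed to be
# defined beforehand.
GridSearchCV(
        VW_Classifier(loss='logistic', moniker='example_sklearn',
                      passes=10, silent=True, learning_rate=10),
        param_grid=parameters,
        score_func=f1_score,
        cv=StratifiedKFold(y_train),
).fit(X_train, y_train)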

Check out example_sklearn.py for more details
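
The grid search above tries every combination of the values listed in param_grid. The sketch below shows that enumeration in plain Python, without scikit-learn. The function expand_grid is hypothetical, the parameter names are taken from the VW_Classifier call above, and the candidate values are made up. Whether VW_Classifier exposes these names to grid search is an assumption.

```python
from itertools import product

def expand_grid(param_grid):
    """Return every combination of values in param_grid as a list of dicts."""
    names = sorted(param_grid)
    return [dict(zip(names, values))
            for values in product(*(param_grid[name] for name in names))]

# Hypothetical grid: names come from the VW_Classifier call above,
# candidate values are illustrative only.
parameters = {'learning_rate': [1, 10], 'passes': [5, 10]}
combinations = expand_grid(parameters)
```

A dict shaped like parameters is what param_grid expects; each entry of combinations corresponds to one model that the search would fit and score.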

Library Interface (DISABLED as of 2013-08-12)

Via the VW interface:

with vw.predicting_library():
    for instance in ['1 |large burnt sienna rhombus',
                     '1 |little teal oval']:
        prediction = vw.push_instance(instance)

Now the predictions are returned directly to the parent process, rather than having to read from disk. See examples/example1.py for more details.

Alternatively you can use the raw library interface:

import vw_c
vw = vw_c.VW("--loss=quadratic --l1=0.01 -f model")
vw.learn("1 |this is a positive example")
vw.learn("0 |this is a negative example")
vw.finish()

The library interface currently does not support multiple passes due to some limitations in the underlying vw C code.

Need more examples?

  • example1.py: SimpleModel class wrapper around VP (both standard and library flavors)
  • example_library.py: Demonstrates the low-level vw library wrapper, classifying lines of Alice in Wonderland vs. Through the Looking Glass.

Why

vowpal_wabbit is insanely fast and scalable. vowpal_porpoise is slower, but only during the initial training pass. Once the data has been properly cached it will idle while vowpal_wabbit does all the heavy lifting. Furthermore, vowpal_porpoise was designed to be lightweight and not to get in the way of vowpal_wabbit's scalability, e.g. it allows distributed learning via --nodes and does not require data to be batched in memory. In our research work we use vowpal_porpoise on an 80-node cluster running over multiple terabytes of data.

The main benefit of vowpal_porpoise is allowing rapid prototyping of new models and feature extractors. We found that we had been doing this in an ad-hoc way using python scripts to shuffle around massive gzipped text files, so we just closed the loop and made vowpal_wabbit a python library.
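
As a rough illustration of feeding data without batching it in memory, the sketch below reads instances lazily from a gzipped text file with a generator. The function iter_instances and the demo file are hypothetical and not part of vowpal_porpoise.

```python
import gzip
import os
import tempfile

def iter_instances(path):
    """Yield one instance per line from a gzipped text file, lazily."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:  # skip blank lines
                yield line

# Build a tiny gzipped file so the sketch is self-contained.
fd, demo_path = tempfile.mkstemp(suffix=".gz")
os.close(fd)
with gzip.open(demo_path, "wt") as handle:
    handle.write("1 |big red square\n0 |small blue circle\n")

streamed = list(iter_instances(demo_path))
os.remove(demo_path)
```

In real use the generator would be consumed one line at a time, for example by calling vw.push_instance on each line inside a training() block, rather than collected into a list.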

How it works

vowpal_porpoise wraps the vw binary in a subprocess, using stdin to push data and temporary files to pull predictions. Why not use the prediction labels vw provides on stdout? It turns out that the python GIL basically makes streaming in and out of a process (even asynchronously) painfully difficult. If you know of a clever way to get around this, please email me. In other languages (e.g. in a forthcoming scala wrapper) this is not an issue.

Alternatively, you can use a pure api call (vw_c, wrapping libvw) for prediction.
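
The subprocess pattern described above (push over stdin, pull results from a temporary file) can be sketched as follows. This is not vowpal_porpoise's actual code. A small Python child process stands in for the vw binary, and its "prediction" is just the number of whitespace-separated tokens on each line. All names here are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for the vw binary (an assumption for illustration): reads instances
# from stdin and writes one result per line to the file named by its argument.
CHILD_SOURCE = (
    "import sys\n"
    "with open(sys.argv[1], 'w') as out:\n"
    "    for line in sys.stdin:\n"
    "        out.write('%d\\n' % len(line.split()))\n"
)

def stream_and_collect(instances):
    """Push instances over stdin, then read results back from a temp file."""
    fd, path = tempfile.mkstemp(suffix=".predictions")
    os.close(fd)
    try:
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE, path],
            stdin=subprocess.PIPE,
            universal_newlines=True,
        )
        for instance in instances:
            child.stdin.write(instance + "\n")
        child.stdin.close()  # closing stdin signals end of input
        child.wait()         # the child has finished writing the file
        with open(path) as handle:
            return [line.strip() for line in handle]
    finally:
        os.remove(path)

results = stream_and_collect(["1 |large burnt sienna rhombus",
                              "0 |little teal oval"])
```

Because the parent only writes to the child and reads results after the child exits, it never has to interleave reads and writes on the same process.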

Contact

Joseph Reisinger @josephreisinger

Contributors

License

Apache 2.0
