In this tutorial, we will complete a small end-to-end Machine Learning project using scikit-learn (scikit-learn.org), a comprehensive yet simple library and one of the most useful Machine Learning libraries for Python.
On a small dataset, we will go through the typical pipeline of a real Machine Learning project: start with statistical summaries and visualization of the data, build several different machine learning models, use cross-validation to estimate their accuracy, select the best algorithm, and finally make and evaluate predictions on a validation set.
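The pipeline described above can be sketched in a few lines. This is only an illustrative outline, not the session's actual code: the Iris dataset, the three candidate models, the 80/20 split and the 10-fold setting are all assumptions chosen for the example, and the visualization step is left out to keep it short.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: Iris stands in for the "small dataset" used in the session.
iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# Step 1: statistical summary of the features (count, mean, std, quartiles).
summary = X.describe()
print(summary)

# Step 2: hold back a validation set that the models never see during selection.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)

# Step 3: several candidate models (an arbitrary selection for illustration).
models = {
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=1),
}

# Step 4: estimate each model's accuracy with 10-fold cross-validation
# on the training part only.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy").mean()
    for name, model in models.items()
}
for name, score in cv_scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")

# Step 5: select the best algorithm, refit it on all training data,
# and evaluate its predictions on the validation set.
best_name = max(cv_scores, key=cv_scores.get)
best_model = models[best_name].fit(X_train, y_train)
val_accuracy = accuracy_score(y_val, best_model.predict(X_val))
print(f"Best model: {best_name}, validation accuracy = {val_accuracy:.3f}")
```

Keeping the validation set out of the cross-validation loop means the final accuracy is measured on data that played no part in choosing the model.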
At the end of the session, we may have a look at some other useful functions built into scikit-learn.
The following tools will be used in this code clinic:
Python 3 – www.python.org
Python SciPy libraries:
– scipy
– numpy
– matplotlib
– pandas
– sklearn (short for scikit-learn)
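Before the session, it may help to confirm that the libraries listed above are installed. The following is a minimal sketch of such a check; it is not part of the original material, and it simply reports any package that cannot be imported.

```python
import importlib

# Import names of the libraries listed above; scikit-learn is imported as "sklearn".
packages = ["scipy", "numpy", "matplotlib", "pandas", "sklearn"]

versions = {}
for name in packages:
    try:
        # Each of these packages exposes its version as __version__.
        versions[name] = importlib.import_module(name).__version__
    except ImportError:
        versions[name] = None  # package is not installed

for name, version in versions.items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as missing can usually be installed with pip or conda under its distribution name, which for sklearn is scikit-learn.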
You should stick to your favourite Python IDE; I will be working in Spyder – www.spyder-ide.org, which I highly recommend as an IDE for R users who are starting with Python and moving over from RStudio.