Kaggle is a popular platform for showcasing your skills and submitting your algorithms in the form of kernels. Google recognized its potential and acquired it in 2017. Competitions award data scientists and data analysts cash prizes and medals, which encourages others to participate and code. For many, it was a breath of fresh air; for others, it was an opportunity to optimize their kernels, since resubmission is allowed. Many companies now organize hackathons on Kaggle to find the right candidates for their data science openings.
If you are new to Kaggle, this guide will help you get started with its kernels and other features. Let’s tackle this step by step.
Visit kaggle.com and Register With Your Credentials
Kaggle is a huge repository of kernels. It is also an online community of users with ideas. Just go to kaggle.com and register as a new user by logging in with your Google, Yahoo or Facebook credentials. Alternatively, you can create a new account with your email address and a new password. Either way, you should land on the home page.
After Signing in You Will Be Greeted With the Homepage
At the top, there are various options such as competitions, datasets, kernels and discussion. Our primary focus is on kernels. Below on the left, you will find a Facebook-like feed that updates according to your interests. On the right, you will find a card that summarises your profile on Kaggle and where you stand. Below that card is another card listing all the competitions that you have participated in.
Time to Dig in
After getting a short glimpse of Kaggle and its features, let’s dive into the Kernels option and see what we have here. This is where all the magic happens! By default, Kaggle's algorithm shows the kernels most relevant to you, sorted by how much they are trending. We can also sort kernels with the filter options, which include Most Votes, Most Comments, Recently Created, Recently Run and Relevance. There are many kernels for data science, ranging from XGBoost to House Price Prediction using Linear Regression, PyTorch, TensorFlow and many more. Some include a set of instructions that guide you through the implementation, while others contain just the code that produces an output.
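To give a flavor of what a beginner kernel in the house-price style contains, here is a minimal sketch in plain Python. It is not taken from any actual Kaggle kernel: the data is synthetic, and the helper names fit_line and rmse are made up for illustration. It fits a straight line by ordinary least squares on 80% of the data and checks the error on the remaining 20%.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the fitted line on (xs, ys)."""
    squared_errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Synthetic data, made up for illustration: price = 100 * area + 5000 + noise.
rng = random.Random(0)
areas = [rng.uniform(50, 250) for _ in range(200)]
prices = [100 * a + 5000 + rng.gauss(0, 1000) for a in areas]

# Train on the first 80% and hold out the last 20% for validation.
split = int(0.8 * len(areas))
slope, intercept = fit_line(areas[:split], prices[:split])
val_rmse = rmse(areas[split:], prices[split:], slope, intercept)

print(f"slope={slope:.2f} intercept={intercept:.2f} validation RMSE={val_rmse:.2f}")
```

A real kernel would load the competition's dataset and usually rely on a library model instead of hand-written formulas, but the shape is the same: fit on one part of the data, validate on the rest.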
After selecting any one of the competitions, we can add a new kernel by choosing the new kernel option in the right corner, just below the name.
This will ask you whether you want to create a Script or a Notebook.
A Script is like a coding platform where you can start coding directly in either R or Python. It is ideal for fitting machine learning models and for direct competition submissions.
The advantage of a major IT giant buying your company is the extra processing power! This kernel provides you with 16GB RAM, along with GPU capabilities if necessary. It also comes with essential Python packages such as numpy and pandas preinstalled! This not only helps existing users but also gives new users the right platform to showcase their skills.
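As a rough illustration of what the first few lines of a script kernel might do with numpy and pandas, here is a minimal sketch. The table is made up and built in memory so the example is self-contained. A real kernel would typically load a competition file with pd.read_csv instead, and the file path depends on the competition.

```python
import numpy as np
import pandas as pd

# Tiny in-memory table standing in for a competition CSV (values are made up).
df = pd.DataFrame({
    "neighborhood": ["north", "north", "south", "south", "east"],
    "area_sqm": [80.0, 120.0, 95.0, np.nan, 150.0],
    "price": [210000, 305000, 190000, 175000, 420000],
})

# Fill the missing area with the column median, a common first cleaning step.
median_area = df["area_sqm"].median()
df["area_sqm"] = df["area_sqm"].fillna(median_area)

# Derive a simple feature and summarize the data.
df["price_per_sqm"] = df["price"] / df["area_sqm"]
mean_by_neighborhood = df.groupby("neighborhood")["price"].mean()

print(df.describe())
print(mean_by_neighborhood)
```

Steps like these (loading, cleaning, summarizing) usually come before any model fitting in a kernel.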
A Notebook resembles the Jupyter platform, where the user can share insights and findings alongside the code. This can be done in either R or Python.
The Notebook interface is somewhat similar to that of a Script, and all the same functions and capabilities are available in it. Kernels can also be used to create great-looking dashboards in RMarkdown.
We also have the freedom to set our kernels to private or public. A public kernel is available and visible to everyone, whereas a private kernel is visible only to its owner and to those it has been shared with. This is particularly useful when the owner of a kernel wants to post XGBoost code in Python but doesn’t want fellow competitors to see it.
Kernels have been designed with the flexibility and usability needs of current users in machine learning, deep learning, and neural networks in mind. Work in these fields relies on many libraries to run the chosen algorithm, and running them takes enough processing power. This coding platform not only helps the user with the processing but also suggests any new library or function that needs to be created or declared.
What Makes Kaggle Different
A Self-motivating Community
Just because a competition doesn’t carry a large reward doesn’t mean it should be ignored. Kaggling is all about learning new things every day and applying them where they can create a positive impact, and that’s the message this community tries to convey.
The 3 P’s
Practice, patience, perfection: this is the underlying mantra of Kaggle. Data Science is a field where practice is much more important than luck. More kernels mean more practice, which means you have a better shot at cracking that dream interview! Coupled with patience, practice can help you grasp the basic working of an algorithm without simply memorizing it!
Every competition has its own discussion panel, which includes practitioners who have been working in this field for 5-10 years. They share their expertise and opinions, which can satisfy our curiosity and point us to ideas that are crucial for a deeper understanding of this field.
Kaggle has been instrumental in making many realize the importance of data science in today’s fast-growing IT sector. Whether you are a data analyst working with SQL and Excel or are becoming a data scientist who deals with machine learning models and story-telling on a daily basis, Kaggle can definitely help those who are ambitious and want to make a mark in the field of data science. Just search for an easy kernel that includes plenty of exploratory data analysis along with machine learning model fitting and validation. This will help you start your journey. Then follow the steps above to keep going, and try to understand the logic behind each algorithm. All the very best!