Analytics is increasingly used to guide business decisions across a variety of domains. Businesses now have more data than ever on their consumers and the market, so it has become important for them to leverage that data to assist in their decision making.

Marketing is one domain where data analysis is used to provide actionable insights to marketing managers. One of the key tasks of any brand manager is to identify customer segments. A customer segment is a group within the population that shares characteristics such as age, education, preferences and attitudes. Segments can help marketing…

One of the key skills a good data analyst needs is the ability to extract data from large databases. SQL is a domain-specific language that helps you access and manipulate large databases.

In this article, we will go over some of the basic commands that are important for mastering SQL. We will use Sakila, a public database that lists information about movies and customers. The database has many tables that are connected to each other through primary and foreign keys. The entity relationship diagram is shown below.
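As a rough illustration of what "connected through primary and foreign keys" means in practice, here is a minimal sketch using Python's built-in `sqlite3` module. The two tables are a heavily simplified, hypothetical stand-in for Sakila (the column set and the rows are invented for this example and are not the real schema or data); the point is only to show a foreign key in `rental` pointing at the primary key of `customer`, and a join that follows it.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with two linked tables.

    NOTE: this schema and its rows are illustrative assumptions,
    not the actual Sakila schema or data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,   -- primary key
            first_name  TEXT NOT NULL
        );
        CREATE TABLE rental (
            rental_id   INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL
                REFERENCES customer (customer_id),  -- foreign key
            film_title  TEXT NOT NULL
        );
        INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
        INSERT INTO rental VALUES
            (1, 1, 'Film A'), (2, 1, 'Film B'), (3, 2, 'Film C');
        """
    )
    return conn


def rentals_per_customer(conn):
    """Count rentals per customer by joining on the key relationship."""
    query = """
        SELECT c.first_name, COUNT(r.rental_id) AS rentals
        FROM customer AS c
        LEFT JOIN rental AS r ON r.customer_id = c.customer_id
        GROUP BY c.customer_id
        ORDER BY rentals DESC, c.first_name
    """
    return conn.execute(query).fetchall()


for name, rentals in rentals_per_customer(build_toy_db()):
    print(name, rentals)
```

The `LEFT JOIN` keeps customers with no rentals in the result (with a count of zero), which an inner join would silently drop.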

Attrition Analysis and Prediction in Python


In this article, we are going to conduct HR attrition analysis in Python. The dataset we will use is a very popular one that can be found at the following link:

The purpose of this exercise is to identify factors that lead to attrition and then predict employee attrition using multiple machine learning algorithms. The dataset has thirty-two independent variables that range from tenure to the distance from the workplace. A comprehensive dataset like this requires an extensive EDA process before a machine learning algorithm can be built.
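One simple EDA step for spotting candidate attrition factors is to compare the attrition rate across the levels of a single variable. The sketch below does this in plain Python. The field names `overtime` and `attrition` and the eight records are hypothetical placeholders made up for illustration, not values taken from the dataset.

```python
from collections import defaultdict


def attrition_rate_by(records, factor):
    """Return {level: share of employees who left} for one factor."""
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for row in records:
        level = row[factor]
        totals[level] += 1
        if row["attrition"]:
            leavers[level] += 1
    return {level: leavers[level] / totals[level] for level in totals}


# Toy records (invented for this example, not real HR data).
employees = [
    {"overtime": "yes", "attrition": True},
    {"overtime": "yes", "attrition": True},
    {"overtime": "yes", "attrition": False},
    {"overtime": "yes", "attrition": False},
    {"overtime": "no", "attrition": True},
    {"overtime": "no", "attrition": False},
    {"overtime": "no", "attrition": False},
    {"overtime": "no", "attrition": False},
]

rates = attrition_rate_by(employees, "overtime")
print(rates)
```

A large gap between levels only flags a variable as worth a closer look; it does not by itself show that the variable causes attrition.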

The most importat…

Classification problems are one of the most common areas where machine learning algorithms are applied with excellent results. The biggest difference between a regression problem and a classification problem is that in a classification problem the target variable is categorical (often binary), whereas in a regression problem it is continuous.

In this article, we will go over the heart disease dataset published by the UCI Machine Learning Repository, where the target variable is heart disease. We will go over multiple algorithms and will also see how boosting can improve model results.
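To give a feel for why boosting can help, here is a minimal, hand-rolled AdaBoost sketch built on one-split decision stumps. It is an illustration under stated assumptions, not the method or library used in the article: the ten-point one-dimensional dataset is a toy example, and all function names are made up for this sketch.

```python
import math


def stump_predict(x, threshold, polarity):
    """Predict `polarity` when x is below the threshold, else `-polarity`."""
    return polarity if x < threshold else -polarity


def best_stump(xs, ys, weights):
    """Find the stump with the lowest weighted error."""
    points = sorted(set(xs))
    thresholds = (
        [points[0] - 0.5]
        + [(a + b) / 2 for a, b in zip(points, points[1:])]
        + [points[-1] + 0.5]
    )
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            error = sum(
                w
                for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` stumps, reweighting misclassified points each round."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = max(error, 1e-12)  # guard against division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Misclassified points gain weight, correct ones lose weight.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def ensemble_predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


def accuracy(ensemble, xs, ys):
    hits = sum(ensemble_predict(ensemble, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)


# Toy data: labels flip several times, so no single split can fit them all.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single = accuracy(adaboost(xs, ys, rounds=1), xs, ys)
boosted = accuracy(adaboost(xs, ys, rounds=3), xs, ys)
print(single, boosted)
```

On this toy data a single stump gets 7 of 10 points right, while three boosted stumps fit all ten. These are training-set numbers on a contrived example, so they say nothing about how much boosting would help on the heart disease data.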

The dataset has multiple categorical and continuous independent variables which can be used to predict heart disease in a patient. After…

Suhaib Ali Kamal

Passionate about data and tech. LinkedIn:
