DATA SCIENCE E-BOOK


Board Infinity is a digital marketing and data science institute in Mumbai and a full-stack career platform for students and job seekers, built around personalised learning paths, career coaches, and access to job opportunities. It provides online and offline training in Data Science, Digital Marketing, Full Stack Web Development, Product Management, Machine Learning and Artificial Intelligence, along with online career counselling and other career solutions.


TABLE OF CONTENTS

• Career Path in Data Science: Explained
• Future of Data Science In India
• Top Reasons to Switch Careers To Data Science
• Various Job Titles
• 5 Tools Every Data Scientist Should Know About
• Soft Skills a Fresher Must Have! For Data Science
• Technical Interview Q&A for Data Scientists


Career Path in Data Science: Explained

Career in Data Science | Scope & Job roles

Introduction to Data Science

Data science is about making sense of the vast amounts of data now available locally and globally. Businesses want to make intelligent use of this data. The data science domain draws on mathematics, statistics, computer science, and information science to analyse and understand data.

Companies that want to maximize their profits, analyze customer segments, and minimize their losses are actively looking for data scientists. Data science career paths include machine learning engineer, business analyst, data analyst, systems analyst, infrastructure architect, and others.

Why is data science needed?

Organisations hold a lot of unstructured data that can help them customise their offerings. This is where data science comes into the picture! According to analyticsindiamag.com, the total number of analytics and data science job positions available in India in 2019 amounted to 97,000. With the advent of data science, industries can make careful, data-driven decisions, which has made data science the "Sexiest Job of the 21st Century".

How to Become a Data Scientist?

To become a full-fledged data scientist, you must be proficient in mathematics, statistics, and computer science.

By 2020, vacancies in data science jobs will touch almost two lakh, says the Economic Times.

Data science is a new and nascent field, and India lags far behind in offering degrees in it. As a result, very few people possess real knowledge and the right amount of expertise in this field, and this gap will soon grow larger. Online data science courses can help you sharpen your competencies. It is therefore advisable to choose an online course that connects you to leading industry experts and to your peers for mentoring and networking opportunities.

Data Science Syllabus

1.) Business Analytics with Excel

Excel is a powerful tool that can perform many functions. It is a business analyst's best friend, and you will learn why. You will cover everything from basic functions to more complex features such as pivot charts and VBA, and a real-time case study will be assigned to build your practical, hands-on knowledge.


2.) Data Visualization and Storytelling using Tableau

Every business needs to visualise its data in order to analyse it and make decisions from it. Tableau, the most popular tool for the job, helps you do just that. Learn the ins and outs of this tool to strengthen your data visualisation skills.

3.) Maths and Python Programming

Both maths and Python are fundamental to understanding and implementing data science. From algorithms, statistics, probability, and hypothesis testing to Pandas, Anaconda, and Jupyter, you will move from the basics to the advanced levels with clarity and ease.

4.) Machine Learning

Learn machine learning concepts such as decision trees, linear regression, logistic regression, and NLP, and apply them in live projects to build efficient models and deepen your knowledge.

5.) Advanced SQL

The next step after basic SQL (assuming you already know the basics). Learn more about MySQL, NoSQL, and more advanced queries to become a subject matter expert. This will help you not only create data but also retrieve it.

6.) Soft skills

Technical skills are not everything; you also need strong soft skills if you want to be selected for your dream job. Social and communication skills are essential for cracking interviews. Through personalized sessions, you will learn tools to improve your online presence, ways to sharpen your interview skills, and how to build a strong portfolio.

Future of Data Science in India

90% of the data in the world was generated in the past two years alone.

There are 2.5 quintillion bytes of data created each day (a quintillion is 1 followed by 18 zeros)!

Think about that. Because of this huge amount of data being created daily, more and more companies are becoming data-driven. This trend will keep rising, especially since a data scientist's job can't really be replaced anytime soon.

Companies need data scientists, and they are even willing to pay a lot of money to keep them.

The role of a data scientist includes:

Collecting huge amounts of data from the internet

Cleaning up irrelevant data and storing the data in a database

Researching the data and framing the questions that need to be answered

Using modelling, statistics, and analytics programs to organize the data into a predictive model

Analyzing the data to identify trends, opportunities, and answers to issues or problems when required

The Way Forward for Freshers?

Regardless of how things evolve in the future (we can't really predict that yet), one thing is crystal clear: companies will be using enormous amounts of data to drive key business decisions, and skilled data scientists will be the key to unlocking endless possibilities.

Top Reasons to Switch Careers To Data Science

Skill Gap

The first reason to become a data scientist is the skill gap. While the demand for data science is sky-high, supply is short because few people have the required skills.

The education system is still lacking in this sector. Many people who are trying to find jobs in this sector don't have the skills needed to find employment.

Salary

The average data scientist salary ranges between 10 and 12.5 lakhs across experience levels. The annual Analytics India Salary Study also found a 1.8% increase in salaries of entry-level analytics professionals with 0 to 3 years of experience.

Increased Adoption

More and more organizations have started integrating data into their processes and are moving towards data-driven decisions.

Data science enables companies to create new business opportunities, generate more revenue, predict future trends, optimize current operations, and produce actionable insights. That way, an organization can grow and evolve over time and become more adaptable as a result.

Various Job Titles

Data science will never restrict you. It is a wide domain that gives you a lot of variety to choose from. You can choose between multiple specialties like:

1. Data Engineer
2. Data Architect
3. Data Analyst
4. Statistician
5. Machine Learning Engineer
6. Big Data Engineer
7. Business Analyst
8. MIS Reporting

Over the last few years, people have moved away from the umbrella term "data scientist". Companies now advertise for a diverse set of job roles.

5 Tools Every Data Scientist Should Know About

1. MS Excel

Microsoft Excel is a spreadsheet application that is bundled as part of the MS Office suite of office productivity tools. Excel has a wide range of functionalities, from sorting and manipulating data to representing that data in the form of graphs and charts.

2. Python

Python is a high-level, interpreted, general-purpose programming language, well suited for rapid application development. Its simple, easy-to-learn syntax makes for a gentle learning curve and reduces the cost of program maintenance. There are many reasons why it is the preferred language for data science. To mention a few: scripting potential, readability, portability, and performance.

3. Tableau

Tableau is another option for creating interactive dashboards from a combination of multiple data sources. It offers a desktop version, a web version, and an online service for sharing the dashboards you create.

It works naturally "with the way you think" (as it claims), and it is easy to use for non-technical people, helped by lots of tutorials and online videos.

4. Apache Hadoop

Apache Hadoop is a free, open-source framework that can store and manage very large amounts of data. It provides distributed computing over massive data sets across clusters of thousands of computers. It is used for high-level computations and data processing.

5. Jupyter Notebook

Jupyter Notebook is an open-source, web-based interactive computational notebook that is free to use. It has gained popularity in recent years and has been widely adopted for the various applications it supports.

Soft Skills a Fresher Must Have! For Data Science

Be it a techie or a digital marketing enthusiast, if one is not skilled enough, the chances of landing a job are close to negligible.

1. Listening
2. Critical Thinking
3. Collaboration
4. Creativity
5. People Management
6. Agility

1. Listening

Being a good listener is very important for effective communication in the workplace. If you don't listen effectively, there is a big chance of misunderstandings and mistakes.

2. Critical Thinking

It is the ability to think clearly and rationally about what is to be done in a problematic situation. Even as AI and technology advance, and whatever the experts say, the human mind still perceives many factors that a machine does not.

3. Collaboration

Also known as teamwork, this skill helps you work efficiently in a team and engage with others to create or produce something.

Employers look for this skill in freshers. They don't want people who only work single-handedly; they want people with team skills.

4. Creativity

Often, all a problem needs is an out-of-the-box solution. Sometimes that means looking at familiar things with a fresh perspective. Creativity can get rid of mundaneness, and it is a skill you can develop and manage.

5. People Management

The ability to manage a team is essential to the success of the organization and its business objectives.

Such is the world today that we prefer to handle everything over text rather than make a phone call or meet face to face.

Thus, communicating with team members and getting the work done is an essential soft skill to develop.

6. Agility

Change is the new constant, and with advancing technology and times, you need to have a do-or-die attitude. If you feel there is something you cannot do and you lose hope, you will become irrelevant. You need to be agile, ready to adapt, and willing to learn anything new that is thrown at you.

Technical Interview Q&A for Data Scientists

Importance of p-value?

After a hypothesis test is run, a significance value (the p-value) is computed; it lies between 0 and 1. If the p-value is less than the chosen significance level (commonly 0.05), the null hypothesis is rejected; if it is greater, we fail to reject the null hypothesis. The p-value is the probability of seeing a result at least as extreme as the one observed if the null hypothesis were true, so it tells you how strong the evidence against the null hypothesis is.
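To make the p-value concrete, here is a minimal sketch of a one-sample z-test using only Python's standard library. The sample values, the hypothesised mean `mu0`, and the population standard deviation `sigma` are made-up assumptions for illustration, and the function name `two_sided_p_value` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def two_sided_p_value(sample, mu0, sigma):
    """Two-sided p-value of a one-sample z-test (sigma is assumed known)."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Probability of a |z| at least this large if the null hypothesis is true.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Made-up measurements; null hypothesis: the true mean is 50.
sample = [52, 55, 49, 58, 54, 53, 57, 56, 51, 55]
z, p = two_sided_p_value(sample, mu0=50, sigma=4)

alpha = 0.05  # chosen significance level
decision = "reject" if p < alpha else "fail to reject"
print(f"z = {z:.2f}, p = {p:.4f}: {decision} the null hypothesis")
```

A small p-value means the observed sample would be unlikely if the null hypothesis were true, which is why it leads to rejection.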

What is A/B testing?

A/B testing is a form of hypothesis testing for a randomized experiment with two variants, A and B. It is used to compare two versions of a website or an app to determine which performs better.
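One common way to evaluate an A/B test on conversion rates is a two-proportion z-test. The sketch below is one possible approach, not the only one; the visitor and conversion counts are invented, and the function name `ab_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test comparing the conversion rates of A and B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a, p_b, z, p_value


# Invented data: 5000 visitors saw each version of the page.
p_a, p_b, z, p_value = ab_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"A: {p_a:.1%}, B: {p_b:.1%}, z = {z:.2f}, p = {p_value:.4f}")
```

If the p-value falls below the chosen significance level, the difference between the two versions is unlikely to be due to chance alone.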

How is Hadoop useful?

Hadoop is an open-source framework that helps process large datasets across clusters of computers using simple programming models. It can deal with large-scale unstructured data and lets you run different algorithms on it.
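The best-known of these programming models is MapReduce. The sketch below imitates its map, shuffle, and reduce steps for a word count in a single Python process. It does not use Hadoop or any Hadoop API; the function names and input lines are made up purely to show the idea.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


# In a real cluster each line could live on a different machine.
lines = ["big data needs big clusters", "data science uses big data"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Because each map call only looks at its own line and each reduce call only looks at its own key, the work can be spread across many machines.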


Name some NoSQL databases

Some popular NoSQL databases are MongoDB, Cassandra, HBase, Hypertable, and Redis.

Name some clauses used in SQL

Commonly used SQL clauses include the ORDER BY clause, the TOP clause (SQL Server; MySQL and SQLite use LIMIT instead), the WHERE clause, the GROUP BY clause, and the HAVING clause.
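Here is a small sketch that uses these clauses in one query, run against an in-memory SQLite database from Python. The `orders` table and its rows are invented for illustration. SQLite has no TOP clause, so LIMIT stands in for it here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Asha", "Mumbai", 1200.0),
        ("Asha", "Mumbai", 800.0),
        ("Ravi", "Pune", 300.0),
        ("Ravi", "Pune", 150.0),
        ("Meera", "Mumbai", 2500.0),
        ("John", "Delhi", 900.0),
    ],
)

query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE city != 'Delhi'          -- WHERE filters rows before grouping
    GROUP BY customer              -- GROUP BY forms one group per customer
    HAVING SUM(amount) > 400       -- HAVING filters the groups
    ORDER BY total DESC            -- ORDER BY sorts the result
    LIMIT 2                        -- LIMIT plays the role of TOP in SQLite
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

Note the order in which the clauses apply: WHERE runs on individual rows, HAVING runs on the groups, and sorting and limiting happen last.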

What is a foreign key?

A foreign key is used to link two tables together. It is a column (or set of columns) in one table that points to the primary key of another table, acting as a cross-reference between the tables.
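As a rough illustration, the sketch below creates two linked tables in an in-memory SQLite database from Python. The `customers` and `orders` tables and their contents are made up. SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """CREATE TABLE orders (
           id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(id),
           amount REAL
       )"""
)

conn.execute("INSERT INTO customers VALUES (1, 'Asha')")
conn.execute("INSERT INTO orders VALUES (10, 1, 1200.0)")  # customer 1 exists

try:
    # Customer 99 does not exist, so the foreign key blocks this row.
    conn.execute("INSERT INTO orders VALUES (11, 99, 500.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The foreign key is also what lets the two tables be joined.
joined = conn.execute(
    "SELECT o.id, c.name, o.amount "
    "FROM orders o JOIN customers c ON c.id = o.customer_id"
).fetchall()
print(rejected, joined)
conn.close()
```

The rejected insert shows the cross-reference at work: an order cannot point to a customer that is not in the other table.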

Difference between deep learning and machine learning?

Deep learning is a part of machine learning but has different capabilities. It is about developing algorithms (artificial neural networks) that are loosely modelled on the way the human nervous system reacts.

What is univariate, bivariate and multivariate analysis?

These are statistical techniques that are distinguished by the number of variables involved. Univariate analysis examines one variable at a time, bivariate analysis looks at the relationship or difference between two variables at a time, and multivariate analysis examines more than two variables together.

How to handle missing values in data?

There are a few ways to handle this situation:

1. Replace the missing values with the mean/median value of the observation
2. Run predictive regression models
3. Drop the values
4. Delete the observation
5. Use clustering to find the most accurate value
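Two of the missing-value strategies above, replacing with the mean or median and dropping the values, can be sketched with plain Python. The `ages` list is invented, and `None` is assumed to mark a missing value; in practice a library such as Pandas would usually do this work.

```python
from statistics import mean, median

# Invented data; None marks a missing value.
ages = [25, None, 31, 40, None, 28, 95]

# Values that are actually present.
observed = [a for a in ages if a is not None]

# Strategy: replace each missing value with the mean of the observed values.
mean_filled = [a if a is not None else mean(observed) for a in ages]

# Strategy: replace with the median, which is less affected by outliers like 95.
median_filled = [a if a is not None else median(observed) for a in ages]

# Strategy: drop the missing values altogether.
dropped = observed

print(mean_filled)
print(median_filled)
print(dropped)
```

Which strategy fits best depends on how much data is missing and why; dropping values is the simplest but throws information away.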
