## What's The Difference Between Probability And Likelihood

Probability measures how likely an event is to occur; that is, how certain we are that a specific event will happen. A likelihood function, by contrast, is a function of the parameters within the parameter space that describes the probability of obtaining the observed data. So the fundamental difference is that probability attaches to possible results, while likelihood attaches to hypotheses.
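As a minimal sketch of this distinction, assume a coin-flip setting (the coin, the 10 flips, and the 7 observed heads are illustrative assumptions, not part of the original answer). The same binomial formula is read two ways: as a probability over outcomes when the parameter is fixed, and as a likelihood over hypotheses when the data is fixed.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability: fix the hypothesis (a fair coin, p = 0.5) and vary the outcome k.
probabilities = [binomial_pmf(k, 10, 0.5) for k in range(11)]

# Likelihood: fix the observed data (7 heads in 10 flips) and vary the hypothesis p.
candidate_ps = [i / 100 for i in range(1, 100)]
likelihoods = [binomial_pmf(7, 10, p) for p in candidate_ps]

# The hypothesis on this grid that makes the observed data most probable.
best_p = candidate_ps[likelihoods.index(max(likelihoods))]
```

The probabilities over all outcomes sum to 1, while the likelihood values over the candidate hypotheses do not need to; they are only compared against each other.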

## What Is The Best Way To Learn Machine Learning

Any way that suits your style of learning can be considered the best way to learn. Different people may enjoy different methods. Common options include taking a machine learning course, watching YouTube videos, reading blogs on relevant topics, and reading books that help you self-learn.

## What Are Overfitting And Underfitting Why Does The Decision Tree Algorithm Suffer Often With Overfitting Problem

Overfitting occurs when a statistical model or machine learning algorithm captures the noise in the data rather than the underlying pattern; such a model shows low bias but high variance. Underfitting occurs when a model or algorithm does not fit the data well enough; such a model shows low variance but high bias.

In decision trees, overfitting occurs when the tree is grown to perfectly fit all samples in the training data set. This results in branches with strict rules or sparse data and hurts accuracy when predicting samples that aren't part of the training set.


## What Is Time Series

A time series is a sequence of numerical data points in successive order. It tracks the movement of the chosen data points over a specified period of time, recording them at regular intervals. A time series doesn't require any minimum or maximum time span. Analysts often use time series to examine data according to their specific requirements.


## Name A Popular Dimensionality Reduction Algorithm

Popular dimensionality reduction algorithms are Principal Component Analysis and Factor Analysis. Principal Component Analysis creates one or more index variables from a larger set of measured variables. Factor Analysis is a model of the measurement of a latent variable. This latent variable cannot be measured with a single variable and is seen through the relationships it causes in a set of *y* variables.


## Describe A Few Different Types Of Functions In Machine Learning

The three most common functions you will use in data science and machine learning are loss functions, activation functions, and cost functions. A loss function measures how far an estimated value is from the true value for a single example. Loss functions are used in both linear and non-linear algorithms: training a model means adjusting its parameters to reduce the loss, and the learning rate controls the size of each adjustment.

An activation function is used in an artificial neural network. These are used to help determine whether a neuron should fire to another neuron, allowing the neural network to learn complex patterns.

A cost function typically aggregates the loss over the whole training set; mean squared error in a linear regression model is a common example. A cost function is used to determine the effectiveness of a model: a less effective model has a higher cost, reflecting larger errors between predicted outcomes and actual outcomes.
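The three kinds of function can be sketched in a few lines. This is only an illustration under assumptions: squared error is chosen as the loss, mean squared error as the cost, and sigmoid and ReLU as the activations; the function names are hypothetical and other choices are equally valid.

```python
import math


def squared_error(y_true, y_pred):
    """Loss: how far one prediction is from its true value."""
    return (y_true - y_pred) ** 2


def mse_cost(y_true, y_pred):
    """Cost: the average loss over a whole set of examples."""
    losses = [squared_error(t, p) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


def sigmoid(z):
    """Activation: squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Activation: passes positive inputs through and zeroes out the rest."""
    return max(0.0, z)


cost = mse_cost([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Here only the third prediction is wrong, so the cost is that single squared error of 4 averaged over three examples.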

## What's A Fourier Transform

The Fourier transform is a mathematical technique that transforms a function of time into a function of frequency. It is closely related to the Fourier series. It takes a time-based pattern as input and calculates the overall cycle offset, rotation speed, and strength for all possible cycles. The Fourier transform is best applied to waveforms, since these are functions of time or space. Once a Fourier transform is applied to a waveform, the waveform is decomposed into a sum of sinusoids.
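To make the decomposition concrete, here is a minimal sketch of a naive discrete Fourier transform, the sampled counterpart of the transform described above. The test signal (a 5 Hz sinusoid of amplitude 1 plus a 12 Hz sinusoid of amplitude 0.5, sampled 64 times over one second) is an assumption chosen for illustration, and the direct O(n²) sum stands in for the much faster FFT used in practice.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform: one complex coefficient per frequency bin."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


sample_rate = 64  # hypothetical: 64 samples covering exactly one second
signal = [
    math.sin(2 * math.pi * 5 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 12 * t / sample_rate)
    for t in range(sample_rate)
]

spectrum = dft(signal)

# Scale the first half of the spectrum so each bin reports a sinusoid's amplitude.
magnitudes = [abs(c) / (sample_rate / 2) for c in spectrum[: sample_rate // 2]]
```

With this setup, bin 5 and bin 12 carry the strengths of the two sinusoids, and every other bin is close to zero.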


## What Is Bias In Machine Learning

Bias in data tells us there is an inconsistency in the data. The inconsistency may occur for several reasons, which are not mutually exclusive.

For example, suppose a tech giant like Amazon builds an engine to speed up its hiring process: given 100 resumes, the engine spits out the top five, and the company hires those candidates.

When the company realized the software was not producing gender-neutral results, it was tweaked to remove this bias.

## Why Wouldn’t You Use Manhattan Distance To Calculate The Distance Between Nearest Neighbors With K

This question may help interviewers gauge your ability and experience using different distance calculations. You could begin your answer by recognizing the need for Euclidean distance over the Manhattan calculation.

**Example:** *“Manhattan distance only measures movement along one axis at a time, horizontally or vertically. Using Euclidean distance, I can calculate a straight-line distance in any space, and I’m not limited to just vertical and horizontal moves.”*
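The two metrics in the example answer can be compared directly. This is a small sketch; the function names and the sample points are illustrative assumptions.

```python
import math


def manhattan(a, b):
    """Sum of absolute differences along each axis (grid-style movement)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance between two points in any number of dimensions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


origin, point = (0, 0), (3, 4)
grid_distance = manhattan(origin, point)      # 3 across plus 4 up
straight_distance = euclidean(origin, point)  # hypotenuse of a 3-4-5 triangle
```

The Manhattan distance is never shorter than the Euclidean one, because it is restricted to axis-aligned steps.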


## What Do You Mean By The ROC Curve

Receiver operating characteristic (ROC): the ROC curve illustrates the diagnostic ability of a binary classifier. It is created by plotting the true positive rate against the false positive rate at various threshold settings. The performance metric for the ROC curve is AUC (area under the curve). The higher the area under the curve, the better the predictive power of the model.
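A rough sketch of how the curve and its area can be computed by hand follows. The labels and scores are made-up illustrative data, the helper names are hypothetical, and the input is assumed to contain at least one positive and one negative example; a library routine would normally be used instead.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; everything scored at or above it is predicted positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve via the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


labels = [1, 1, 0, 1, 0, 0]               # 1 = positive class, 0 = negative class
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # hypothetical classifier scores
area = auc(roc_points(labels, scores))
```

In this data one negative example outranks one positive example, so 8 of the 9 positive-negative pairs are ordered correctly and the area comes out as 8/9.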

## What Are Some Methods Of Reducing Dimensionality

You can reduce dimensionality by combining features with feature engineering, removing collinear features, or using algorithmic dimensionality reduction.



## Q14 You're Asked To Build A Random Forest Model With 10000 Trees During Its Training You Got Training Error As 0.00 But On Testing The Validation Error Was 34.23 What Is Going On Haven't You Trained Your Model Perfectly

- The model is overfitting the data.
- A training error of 0.00 means that the classifier has mimicked the training data patterns so closely that they do not carry over to unseen data.
- When this classifier ran on the unseen sample, it could not find those patterns and returned predictions with a higher number of errors.
- In a random forest, this usually happens when the individual trees are grown deeper than necessary; adding more trees by itself rarely makes overfitting worse. To avoid such situations, tune hyperparameters such as tree depth, minimum samples per leaf, and the number of trees using cross-validation.

**Q15. The “People who bought this also bought” recommendations seen on Amazon are based on which algorithm?**

E-commerce websites like Amazon make use of Machine Learning to recommend products to their customers. The basic idea of this kind of recommendation comes from collaborative filtering. Collaborative filtering is the process of comparing users with similar shopping behaviors in order to recommend products to a new user with similar shopping behavior.


To better understand this, let's look at an example. Let's say user A, who is a sports enthusiast, bought pizza, pasta, and a coke. A couple of weeks later, another user B, who rides a bicycle, buys pizza and pasta. He does not buy the coke, but Amazon recommends a bottle of coke to user B since his shopping behavior and lifestyle are quite similar to user A's. This is how collaborative filtering works.
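The example can be sketched as a toy user-based collaborative filter. This is an assumption-laden illustration, not how Amazon's system actually works: similarity is simply the number of shared items, the `recommend` helper and user C are hypothetical, and lifestyle signals are ignored.

```python
from collections import Counter


def recommend(baskets, user, top_n=1):
    """Suggest items bought by users whose baskets overlap with this user's basket."""
    owned = baskets[user]
    scores = Counter()
    for other, items in baskets.items():
        if other == user:
            continue
        overlap = len(owned & items)  # crude similarity: count of shared items
        if overlap == 0:
            continue
        for item in items - owned:
            scores[item] += overlap   # weight each candidate by how similar its buyer is
    return [item for item, _ in scores.most_common(top_n)]


baskets = {
    "A": {"pizza", "pasta", "coke"},
    "B": {"pizza", "pasta"},
    "C": {"bread", "milk"},  # hypothetical unrelated shopper
}
suggestions = recommend(baskets, "B")
```

User B shares two items with user A and none with user C, so the only candidate that scores is the coke from A's basket.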

## Q5 How Would You Predict Who Will Renew Their Subscription Next Month What Data Would You Need To Solve This What Analysis Would You Do Would You Build Predictive Models If So Which Algorithms

- Let's assume that we're trying to predict the renewal rate for Netflix subscriptions. So our problem statement is to predict which users will renew their subscription plan for the next month.
- Next, we must understand the data that is needed to solve this problem. In this case, we need to check the number of hours the channel is active for each household, the number of adults in the household, the number of kids, which channels are streamed the most, how much time is spent on each channel, how much the watch rate has varied from last month, etc. Such data is needed to predict whether or not a person will continue the subscription for the upcoming month.
- After collecting this data, it is important that you find patterns and correlations. For example, we know that if a household has kids, then they are more likely to subscribe. Similarly, by studying the watch rate of the previous month, you can predict whether a person is still interested in a subscription. Such trends must be studied.
- The next step is analysis. For this kind of problem statement, you must use a classification algorithm that classifies customers into 2 groups:
  - Customers who are likely to subscribe next month
  - Customers who are not likely to subscribe next month


## Is Naive Bayes Supervised Or Unsupervised

First, Naive Bayes is not one algorithm but a family of algorithms that inherit the following attributes:

1.Discriminant Functions

3.Bayesian Theorem

4.Naive Assumptions of Independence and Equal Importance of feature vectors.

Moreover, it is a special type of supervised learning algorithm that can make simultaneous multi-class predictions.

Since these are generative models, they may be further classified as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, etc., based on the distribution assumed for each feature.

## Given An Array Of Integers Where Each Element Represents The Max Number Of Steps That Can Be Made Forward From That Element The Task Is To Find The Minimum Number Of Jumps To Reach The End Of The Array If An Element Is 0 Then Cannot Move Through That Element

Solution: This problem is commonly known as the end of array problem. We want to determine the minimum number of jumps required to reach the end. Each element in the array represents the maximum number of steps that can be taken forward from that element.

Let us understand how to approach the problem initially.

We need to reach the end. Therefore, let us keep a count that tells us how near we are to the end. Consider the array A = [1, 2, 3, 1, 1].

In the above example we can go from:

- 1 -> 2 -> 3 -> 1 -> 1 (4 jumps)
- 1 -> 2 -> 1 -> 1 (3 jumps)
- 1 -> 2 -> 3 -> 1 (3 jumps)

Hence, we have a fair idea of the problem. Let us come up with a logic for the same.

Let us start from the end and move backwards, as that is more intuitive. We will use the variables right and prev_r (denoting the previous right) to keep track of the jumps.

Initially, right = prev_r = the last but one element. We consider the distance of an element to the end, and the number of jumps possible from that element. Therefore, if the sum of the number of jumps possible and the distance is greater than the previous element, then we will discard the previous element and use the second element's value to jump. Try it out using pen and paper first. The logic will seem very straightforward to implement. Later, implement it on your own and then verify with the result.
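For checking your own result, here is one possible reference sketch. Note that it is not the backward right/prev_r approach described above: it swaps in the common forward greedy scan, which tracks the farthest index reachable with the current number of jumps. The function name and the choice to return -1 when the end cannot be reached are assumptions.

```python
def min_jumps(steps):
    """Minimum jumps to reach the last index, or -1 if it cannot be reached."""
    n = len(steps)
    if n <= 1:
        return 0  # already at the end
    jumps = 0
    current_end = 0  # farthest index reachable with `jumps` jumps
    farthest = 0     # farthest index reachable with one more jump
    for i in range(n - 1):
        farthest = max(farthest, i + steps[i])
        if i == current_end:
            if farthest == current_end:
                return -1  # stuck: a 0 blocks every path forward
            jumps += 1
            current_end = farthest
            if current_end >= n - 1:
                break
    return jumps


result = min_jumps([1, 2, 3, 1, 1])
```

The scan visits each element once, so it runs in linear time and needs no extra storage beyond the three counters.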



## Skills And Sample Questions

I recently wrote a piece for the Udacity blog entitled “5 Skills You Need to Become a Machine Learning Engineer.” In that article I identified five groupings for the essential skills that a Machine Learning Engineer needs:

I encourage you to read that post for further detail about these groups. What I wish to focus on here are the kinds of questions you're likely to face in a Machine Learning interview, so I'll use these groupings simply as an organizing principle.


## Q1 You Are Given A Data Set Consisting Of Variables Having More Than 30% Missing Values Let's Say Out Of 50 Variables 8 Variables Have Missing Values Higher Than 30% How Will You Deal With Them

- Assign a unique category to the missing values; the missing values might themselves uncover some trend.
- We can remove them outright.
- Or, we can sensibly check their distribution against the target variable and, if we find any pattern, keep those missing values and assign them a new category while removing the others.
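The first option can be sketched in plain Python. Everything here is an illustrative assumption: rows are dictionaries, a missing value is represented as None, the helper names and the "missing" placeholder are hypothetical, and the sample data is made up. Checking the distribution against the target variable is left out.

```python
def missing_fraction(rows, column):
    """Share of rows in which the given column has no value."""
    return sum(1 for row in rows if row.get(column) is None) / len(rows)


def flag_sparse_columns(rows, threshold=0.3, placeholder="missing"):
    """Give missing values their own category in columns above the threshold."""
    columns = sorted({name for row in rows for name in row})
    sparse = [c for c in columns if missing_fraction(rows, c) > threshold]
    cleaned = [
        {
            c: placeholder if c in sparse and row.get(c) is None else row.get(c)
            for c in columns
        }
        for row in rows
    ]
    return sparse, cleaned


rows = [
    {"plan": "basic", "region": None},
    {"plan": "pro", "region": "EU"},
    {"plan": "basic", "region": None},
    {"plan": None, "region": "US"},
]
sparse, cleaned = flag_sparse_columns(rows)
```

Only the region column crosses the 30% threshold here, so its gaps become a category of their own while the single missing plan value is left untouched for separate handling.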

## What Does An Apple Machine Learning Engineer Do

As an Apple machine learning engineer, you will be responsible for extracting value from available data at Apple, along with data collection, cleaning, preprocessing, training and deploying models, and production. Some of your responsibilities as an Apple machine learning engineer will be to:

- Analyze the ML algorithms to solve a given problem and rank them by their success probabilities.
- Explore and visualize data, then identify key differences in data distribution that could influence performance when deploying the model.
- Verify and ensure data quality via data cleaning.
- Define validation strategies.
- Define preprocessing or feature engineering for a given dataset.
- Train models and tune their hyper-parameters.
- Analyze errors of the model and design strategies for overcoming them.
- Deploy models to production.


## Explain The Difference Between Normalization And Standardization

Normalization and standardization are two very popular methods used for feature scaling. Normalization refers to re-scaling the values to fit into a range of [0, 1]. Standardization refers to re-scaling data to have a mean of 0 and a standard deviation of 1. Normalization is useful when all parameters need to have the same positive scale; however, outliers compress the remaining values into a narrow part of that range. Hence, standardization is recommended for most applications.
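Both methods fit in a few lines. This is a minimal sketch that assumes min-max scaling to [0, 1] for normalization and the population standard deviation for standardization; the function names and sample values are illustrative, and the input is assumed to contain at least two distinct values.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Re-scale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Re-scale values to have mean 0 and standard deviation 1 (z-scores)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


values = [2.0, 4.0, 6.0, 8.0, 100.0]  # the last value is a deliberate outlier
normalized = min_max_normalize(values)
standardized = standardize(values)
```

With the outlier present, the four ordinary values are squeezed into the bottom few percent of the normalized range, which is the effect that makes normalization sensitive to outliers.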

## Build A Portfolio For Machine Learning Job Applications: Create A Presence On Github And Kaggle

A significant challenge when it comes to job applications for machine learning engineer positions is simply getting an interview. So, how can a company find you? How can you make yourself stand out?

One answer is to work on creating and completing projects with your skillset. Try out lots of new toy projects, and use resources like Kaggle for inspiration. Participating in discussion forums is another avenue with multiple benefits: you get to learn from and discuss with others while marketing yourself.

Be creative and proactive where possible. Building your profile on GitHub can really help. Write lots of code and solve a variety of problems. It can be hard to find these on your own, but taking part in Kaggle competitions is a really great place to start.

Working on programming projects is another option for building out your portfolio. When I was starting out, I worked on whatever I felt like, and whatever interested me. For a time I tried to create some games myself, but now I often try to understand research papers by implementing their systems. It's one thing to understand the theory, but it's another to write code and implement systems. When you apply for a machine learning job, you'll want to make sure you can do both.


## What Is A Confusion Matrix

A confusion matrix is a table used to describe the performance of a classification model on a specific set of test data. For a confusion matrix to work, you need to know all of the true values of the data set beforehand. When using a confusion matrix, you will work with true positives, true negatives, false positives, and false negatives.

The great thing about a confusion matrix is its ability to show which kinds of errors a model makes, not just how many. Metrics such as precision, recall, and the F-score are computed from its cells. A confusion matrix is not itself a model; it is an evaluation tool that applies to any classifier, whether a classical machine learning model or a deep learning one. Using a confusion matrix is a fantastic way to understand more about your model's predictions than a single accuracy number can show.
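As a small illustration, the four cells and the metrics derived from them can be computed as follows. The labels and predictions are made-up data, the helper names are hypothetical, and a binary problem with 1 as the positive class is assumed.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives, and false negatives."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall, and the F1 score from confusion matrix cells."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

In this data the model makes one false positive and one false negative, so precision, recall, and F1 all come out at 0.75.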

## What Is The Difference Between Deep Learning And Machine Learning

Machine learning involves algorithms that learn from patterns in data and then apply what they learn to decision making. Deep learning, on the other hand, is able to learn through processing data on its own and is loosely similar to the human brain in that it identifies something, analyzes it, and makes a decision. The key differences are as follows:

- The manner in which data is presented to the system.
- Machine learning algorithms typically require structured data, whereas deep learning networks rely on layers of artificial neural networks.

