What Is The General Principle Of An Ensemble Method And What Is Bagging And Boosting In Ensemble Method
The general principle of an ensemble method is to combine the predictions of several models built with a given learning algorithm in order to improve robustness over a single model. Bagging is an ensemble method for improving unstable estimation or classification schemes: the models are trained independently on bootstrap samples of the data and their predictions are averaged or voted on. Boosting methods, in contrast, build models sequentially, with each model concentrating on the errors of the previous ones, to reduce the bias of the combined model. Both boosting and bagging can reduce error by reducing the variance term.
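To make the bagging idea concrete, here is a minimal sketch in plain Python. The toy dataset, the threshold "stump" learner, and the names `fit_stump` and `bagged_predict` are all assumptions made for illustration; they are not taken from any particular library.

```python
import random
from collections import Counter

def fit_stump(points):
    """Pick the threshold on x with the fewest errors (predict 1 when x > threshold)."""
    best_thr, best_err = None, None
    for thr, _ in points:
        err = sum((x > thr) != bool(y) for x, y in points)
        if best_err is None or err < best_err:
            best_thr, best_err = thr, err
    return best_thr

def bagged_predict(stumps, x):
    """Combine the individual stumps by majority vote."""
    votes = Counter(int(x > thr) for thr in stumps)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
# Hypothetical toy data: the label is 1 when x > 5.
data = [(x, int(x > 5)) for x in range(11)]

stumps = []
for _ in range(25):
    # Bootstrap sample: draw len(data) points with replacement.
    sample = [rng.choice(data) for _ in data]
    stumps.append(fit_stump(sample))

print(bagged_predict(stumps, 9), bagged_predict(stumps, 1))
```

Each stump sees a slightly different bootstrap sample, so individual thresholds vary; the vote smooths out that variation. Boosting would instead train the stumps one after another, reweighting the points the earlier stumps got wrong.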
Explain Correlation And Covariance
Correlation: Correlation tells us how strongly two random variables are linearly related to each other. It takes values between -1 and +1.
Formula to calculate Correlation:
$$\rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
Covariance: Covariance tells us the direction of the linear relationship between two random variables. It can take any value between -∞ and +∞.
Formula to calculate Covariance:
$$\operatorname{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big]$$
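As a rough illustration of how the two quantities differ, the sketch below computes the sample covariance (dividing by n - 1, which is an assumption; the population version divides by n) and the resulting correlation for a small made-up dataset.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Made-up data with a perfect linear relationship.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 20.0, 30.0, 40.0]

print(covariance(xs, ys), correlation(xs, ys))
```

Rescaling `ys` changes the covariance but leaves the correlation unchanged, which is why correlation is the easier of the two to compare across datasets.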
What Is Naive Bayes Why Is It Naive
Naive Bayes classifiers are a family of classification algorithms based on Bayes' theorem. The algorithms in this family share a common principle: every pair of features is treated as independent of each other, given the class.
Naive Bayes is considered naive because it assumes that each attribute is independent of the others within the same class. This assumed lack of dependence between attributes of the same class is what makes the method naive.
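The independence assumption can be sketched in a few lines of Python. The spam/ham classes, the words, and every probability below are hypothetical values chosen only to show the mechanics.

```python
# Hypothetical class priors and per-word likelihoods (illustrative values only).
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.5, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.4},
}

def posterior(words):
    """Return P(class | words) under the naive independence assumption."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for word in words:
            # Naive step: multiply per-feature likelihoods as if the
            # features were independent given the class.
            score *= likelihoods[label][word]
        scores[label] = score
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

print(posterior(["free"]))
print(posterior(["free", "meeting"]))
```

The multiplication inside the loop is the whole "naive" part: no term models how the words interact with each other.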
How Will You Know Which Machine Learning Algorithm To Choose For Your Classification Problem
While there is no fixed rule to choose an algorithm for a classification problem, you can follow these guidelines:
- If accuracy is a concern, test different algorithms and cross-validate them
- If the training dataset is small, use models that have low variance and high bias
- If the training dataset is large, use models that have high variance and little bias
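The cross-validation step in the first guideline can be sketched as follows. This is a minimal k-fold index splitter written from scratch; the function name `k_fold_indices` is an assumption for illustration, and in practice a library routine would normally be used and the data shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k roughly equal, contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each candidate algorithm would be trained on `train_idx` and scored on `test_idx` for every fold, and the average score used to compare algorithms.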
What Do You Understand By Selection Bias
In statistical terms, selection bias arises when the sampled data is not representative of the population being studied. For example, suppose you want information about the use of gaming computers in a specific state. To get accurate information, you have to collect data from all the markets that deal in gaming computers in that state.
If you collect data from only one city, your data collection is biased, because you are not sampling from the whole state. This may lead to a wrong conclusion.
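A small numeric sketch of the gaming-computer example, using made-up city names, populations, and ownership rates, shows how far a one-city estimate can drift from the state-wide figure.

```python
# Hypothetical share of gaming-computer owners per city (made-up numbers).
city_rates = {"A": 0.30, "B": 0.10, "C": 0.05}
city_population = {"A": 1000, "B": 3000, "C": 6000}

total_population = sum(city_population.values())

# State-wide rate: weight each city's rate by its population.
state_rate = sum(city_rates[c] * city_population[c] for c in city_rates) / total_population

# Biased estimate: data collected from city A only.
one_city_estimate = city_rates["A"]

print(state_rate, one_city_estimate)
```

With these assumed numbers, sampling only city A overstates the state-wide rate by more than a factor of three.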
What Is Roc Curve
A ROC curve is a graphical plot that illustrates how well a binary classifier can distinguish between classes as its decision threshold varies. The curve is plotted with the true positive rate (TPR) on the y-axis and the false positive rate (FPR) on the x-axis. TPR is also known as sensitivity, recall, or probability of detection, and FPR is also known as the probability of false alarm.
Machine Learning Interview Questions And Answers
This blog discusses the top machine learning questions asked in interviews, starting with the three types of machine learning:
Supervised learning:
- It is like learning under the guidance of a teacher.
- The training dataset acts as the teacher and is used to train the machine.
- The model is trained on a pre-defined dataset before it starts making decisions on new data.
Unsupervised learning:
- It is like learning without a teacher.
- The model learns through observation and finds structures in the data.
- The model is given a dataset and is left to automatically find patterns and relationships in it, for example by creating clusters.
Reinforcement learning:
- It is like being stuck on an isolated island, where you must explore the environment and learn how to live and adapt to the living conditions on your own.
- The model learns through trial and error.
- It learns on the basis of the reward or penalty given for every action it performs.
Does The Job Assistance Program Guarantee Me A Job
No. Our job assistance program is aimed at helping you land your dream job. It offers you an opportunity to explore various competitive openings in the corporate world and find a well-paid job that matches your profile. The final hiring decision will always be based on your performance in the interview and the requirements of the recruiter.
How To Answer Machine Learning Coding Questions
Answering machine learning coding questions is similar to generic coding questions. We recommend following a few steps.
Life As A Machine Learning Engineer
Machine learning engineering is quickly becoming one of the most sought-after careers in the IT field. More companies are adopting AI technologies, including machine learning, and even more plan on doing so within the next five years. This means they're going to be looking to bring on machine learning engineers who will help them adapt to the new technologies and integrate them more efficiently into their operations.
The life of a machine learning engineer looks similar to that of a computer programmer, except they're focused on creating programs that give machines the capability to learn and act without the direction of a person or a specific program. Machine learning engineers can find exciting positions in a variety of industries, many of which will enable them to make a significant contribution to how society interacts with technology and how it enhances our lives.
An individual who seeks a position as a machine learning engineer has an exciting career path ahead of them. In addition to developing applications that enable machines to self-learn and perform without specific human programming, machine learning engineers can work towards a position as an architect who works to develop application prototypes.
Machine learning engineers can work in a range of professional capacities, filling positions that include:
- Machine learning engineer
- Machine learning research scientist
- Data scientist positions
What Is Bayes' Theorem In Machine Learning
Bayes' theorem gives the probability of an event occurring given prior knowledge of conditions related to it:
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
For a test of some condition, this means the posterior probability equals the true positive rate weighted by how common the condition is, divided by the sum of that quantity and the false positive rate weighted by how common the absence of the condition is.
Two of the most significant applications of Bayes' theorem in machine learning are Bayesian optimization and Bayesian belief networks. The theorem is also the foundation of the branch of machine learning that involves the Naive Bayes classifier.
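A short worked example may help here. The sketch below applies Bayes' theorem to a hypothetical test; the prior of 1%, the true positive rate of 90%, and the false positive rate of 5% are assumed numbers, and `bayes_posterior` is a name invented for this illustration.

```python
def bayes_posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    # Total probability of a positive result (the evidence term P(B)).
    evidence = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / evidence

p = bayes_posterior(prior=0.01, true_positive_rate=0.9, false_positive_rate=0.05)
print(round(p, 4))
```

Even with a fairly accurate test, the posterior stays well below 50% under these assumptions, because the condition itself is rare.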
Machine Learning Interview Questions: Company/industry Specific
These machine learning interview questions deal with how to apply your general machine learning knowledge to a specific company's requirements. You'll be asked to create case studies and extend your knowledge of the company and industry you're applying for with your machine learning skills.
Q37: What do you think is the most valuable data in our business?
Answer: This question, or questions like it, tries to test you on two dimensions. The first is your knowledge of the business and the industry itself, as well as your understanding of the business model. The second is whether you can judge how strongly data correlates with business outcomes in general, and then how you apply that thinking to the context of the company. You'll want to research the business model, ask your recruiter good questions, and start thinking about what business problems they probably want to solve most with their data.
Q38: How would you implement a recommendation system for our company's users?
Answer: A lot of machine learning interview questions of this type involve applying machine learning models to a company's problems. You'll have to research the company and its industry in depth, especially the company's revenue drivers and the types of users the company takes on in the context of the industry it's in.
Q3 There's A Game Where You Are Asked To Roll Two Fair Six
- The first condition states that if the sum of the values on the 2 dice is equal to 7, then you win $21. In all other cases you must pay $5.
- First, let's calculate the number of possible cases. Since we have two 6-sided dice, the total number of cases is 6 * 6 = 36.
- Out of these 36 cases, we must count the cases that produce a sum of 7.
- The combinations that produce a sum of 7 are (1,6), (2,5), (3,4), (4,3), (5,2), and (6,1). All 6 of these combinations give a sum of 7.
- This means that out of 36 cases, only 6 produce a sum of 7. Taking the ratio, we get 6/36 = 1/6.
- So we have a chance of winning $21 once in 6 games, on average.
- To answer the question: if a person plays 6 times, on average he will win one game for $21 and pay $5 in each of the other 5 games, which is $25 in total. He therefore faces an expected loss of $4 over 6 games, or about $0.67 per game.
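The counting argument above can be checked by enumerating all outcomes. This is a minimal sketch that assumes the payoffs stated in the question (win $21 on a sum of 7, pay $5 otherwise) and uses exact fractions to avoid rounding.

```python
from fractions import Fraction

# All 36 equally likely outcomes of rolling two fair six-sided dice.
outcomes = [(a, b) for a in range(1, 7) for b in range(1, 7)]
wins = [o for o in outcomes if sum(o) == 7]

p_win = Fraction(len(wins), len(outcomes))

# Expected value per game: win $21 with probability p_win, lose $5 otherwise.
expected_value = p_win * 21 + (1 - p_win) * (-5)

print(p_win, expected_value)
```

A negative expected value confirms that the game is unfavorable to the player in the long run.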
Why Is Rotation Required In Pca What Will Happen If The Components Are Not Rotated
Rotation is a significant step in principal component analysis (PCA). Rotation maximizes the separation between the variances captured by the components, which makes the components easier to interpret.
The motive behind conducting PCA is to choose fewer components that can explain the greatest variance in a dataset. When rotation is performed, the original coordinates of the points change. However, there is no change in the relative position of the components.
If the components are not rotated, more components are needed to describe the same amount of variance.
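One way to see the effect of rotation is to compare how variance is spread across the original axes and across the rotated (principal) axes. The sketch below is an illustration under stated assumptions: it uses hypothetical 2-D data with correlated features and the closed-form eigenvalues of a 2x2 covariance matrix, not a full PCA implementation.

```python
import math
import random

rng = random.Random(0)
# Hypothetical 2-D data with two strongly correlated features.
xs = [rng.gauss(0, 1) for _ in range(500)]
ys = [x + 0.3 * rng.gauss(0, 1) for x in xs]

def cov(u, v):
    """Sample covariance of two equally long sequences."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v)) / (len(u) - 1)

a, b, c = cov(xs, xs), cov(xs, ys), cov(ys, ys)

# Eigenvalues of the covariance matrix [[a, b], [b, c]] are the variances
# along the rotated (principal) axes.
mid = (a + c) / 2
half_gap = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
var_rotated = [mid + half_gap, mid - half_gap]
var_original = [a, c]

share_original = max(var_original) / sum(var_original)
share_rotated = var_rotated[0] / sum(var_rotated)
print(round(share_original, 3), round(share_rotated, 3))
```

The total variance is the same before and after rotation, but after rotation a single component carries most of it, whereas along the original axes two components are needed to describe the same variance.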
Considering A Long List Of Machine Learning Algorithms Given A Data Set How Do You Decide Which One To Use
There is no master algorithm for all situations. Choosing an algorithm depends on the following questions:
- How much data do you have, and is it continuous or categorical?
- Is the problem related to classification, association, clustering, or regression?
- Are the variables predefined (labeled), unlabeled, or a mix?
- What is the goal?
Based on the above questions, the following algorithms can be used:
Visualise Tasks You Might Be Expected To Carry Out And Practice How You Might Approach Them
It is common for interviewers to give a sample task they have done and ask how you would approach it. System design interviews are almost always set up this way. ASOS, for example, may ask the candidate to design a system for predicting a consumer's likelihood of not returning to their website. Another common question is to design a web crawler that gathers training samples for an NLP model.
What Do You Mean By The Roc Curve
Receiver operating characteristic (ROC): The ROC curve illustrates the diagnostic ability of a binary classifier. It is created by plotting the true positive rate against the false positive rate at various threshold settings. The performance metric of the ROC curve is AUC (area under the curve). The higher the area under the curve, the better the predictive power of the model.
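The construction of the curve can be sketched directly from its definition. The helper names `roc_points` and `auc` and the four labeled scores are assumptions made for illustration; the area is approximated with the trapezoidal rule.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to loosest."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Everything scoring at or above the threshold is predicted positive.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve using the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up labels and classifier scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
area = auc(points)
print(points, area)
```

An area of 0.5 corresponds to random guessing and 1.0 to a classifier that ranks every positive above every negative.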
Review Recent Machine Learning Projects
Most hiring managers prepare questions about your previous projects using your GitHub repositories, resume, and portfolio. They will ask you to explain how you overcame certain issues in a specific project. Don't get overwhelmed; just review your portfolio projects. Don't forget, you can use DataCamp Workspace to showcase your projects.
What Is Linear Regression In Machine Learning
Linear Regression is a supervised Machine Learning algorithm. It is used to find the linear relationship between the dependent and independent variables for predictive analysis.
The equation for Linear Regression:
$$Y = a + bX$$
- X is the input or independent variable
- Y is the output or dependent variable
- a is the intercept, and b is the coefficient of X
The best-fit line relates weight (Y, the dependent variable) to height (X, the independent variable) for 21-year-old candidates whose data points are scattered over the plot. The straight line shows the best linear relationship, which helps in predicting the weight of candidates according to their height.
To get this best-fit line, the best values of a and b should be found. By adjusting the values of a and b, the errors in the prediction of Y can be reduced.
This is how linear regression helps in finding the linear relationship and predicting the output.
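Finding the best values of a and b can be sketched with the ordinary least squares formulas. The height and weight values below are made up so that they lie exactly on a line, and `fit_line` is a name chosen for this illustration.

```python
def fit_line(xs, ys):
    """Return intercept a and slope b that minimize the squared prediction error."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Made-up heights (cm) and weights (kg) lying exactly on weight = 0.8 * height - 70.
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
weights = [50.0, 58.0, 66.0, 74.0, 82.0]

a, b = fit_line(heights, weights)
predicted = a + b * 175.0
print(a, b, predicted)
```

With real, noisy data the points would not fall exactly on the line, and a and b would be the values that make the total squared error as small as possible.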
How To Prepare For Machine Learning Coding Questions
Although the list contains only 5 algorithms, memorizing the code line by line is rather unrealistic. Instead, focus on understanding and internalizing the algorithms. You will then feel much more confident and comfortable with the implementation. Here is how to study and practice by yourself.
Familiarize Yourself with the Algorithms
Before implementation, it's essential to understand the algorithm steps clearly. Again, we recommend Andrew Ng's machine learning class for reviewing the algorithms.
Writing code in Python on a Jupyter notebook is highly recommended for debugging and testing purposes.
Hard Ml Interview Questions
18. Describe the idea behind boosting. Give an example of one method and describe one advantage and one disadvantage it has.
19. Say we are running a probabilistic linear regression which does a good job modeling the underlying relationship between some y and x. Now assume all inputs have some noise ε added, which is independent of the training data. What is the new objective function? How do you compute it?
20. What is the loss function used in k-means clustering for k clusters and n sample points? Compute the update formula using 1) batch gradient descent, 2) stochastic gradient descent for the cluster mean for cluster k using a learning rate ε.
21. You’re working with several sensors that are designed to predict a particular energy consumption metric on a vehicle. Using the outputs of the sensors, you build a linear regression model to make the prediction. There are many sensors, and several of the sensors are prone to complete failure. What are some cost functions you might consider, and which would you decide to minimize in this scenario?
22. Say we are using a Gaussian Mixture Model for anomaly detection on fraudulent transactions to classify incoming transactions into K classes. Describe the model setup formulaically and how to evaluate the posterior probabilities and log likelihood. How can we determine if a new transaction should be deemed fraudulent?
25. Formulate the background behind an SVM, and show the optimization problem it aims to solve.
How To Answer Machine Learning Basics Questions
The key to answering this kind of question is to be concise and organized. Here is our suggested answer outline.
Here is an example Q&A:
Q: What is overfitting and how do you deal with overfitting?
A: Overfitting happens when the learning power of a model is too high or the data size is too small. The model ends up fitting the noise rather than the useful information of the data. So the model performs badly on unobserved datasets.
A: For example, we can encounter an overfitting problem when we have a regression model and the number of data points is less than the number of features.
A: There are a few approaches to deal with overfitting. One way is to use regularization to shrink the learned parameters. L2 regularization keeps the parameter values from becoming too extreme, while L1 regularization can help remove unimportant features. Another way is to use a simpler model to fit the data. We can also increase the amount of training data.
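The shrinking effect of L2 regularization can be shown with a one-feature example. This is a minimal sketch assuming a no-intercept model y = b * x with penalty strength `lam`; the function name and data are hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Slope b that minimizes sum((y - b*x)^2) + lam * b^2 for a no-intercept model."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Setting the derivative to zero gives b = sxy / (sxx + lam).
    return sxy / (sxx + lam)

# Made-up data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

unregularized = ridge_slope(xs, ys, lam=0.0)
regularized = ridge_slope(xs, ys, lam=14.0)
print(unregularized, regularized)
```

A larger `lam` pulls the slope toward zero, trading a little bias for lower variance, which is how regularization counteracts overfitting.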
What Are The Different Modes Of Training That Intellipaat Provides
At Intellipaat, you can enroll in either the instructor-led online training or self-paced training. Apart from this, Intellipaat also offers corporate training for organizations to upskill their workforce. All trainers at Intellipaat have 12+ years of relevant industry experience, and they have been actively working as consultants in the same domain, which has made them subject matter experts. Go through the sample videos to check the quality of our trainers.