The Machine Learning part of the interview is usually the most elaborate one. That’s the reason we have dedicated a complete post to the interview questions from ML. We’ve also provided, wherever possible, the link to Suggested Reading material that will be helpful in answering these questions.

We update these links from time to time. If you have a solution to suggest, please feel free to post it. You should explore these questions thoroughly, especially the ones that relate to your previous experience and projects.

General ML Questions

Here is a nice post covering various aspects of machine learning that’ll be a good starting point.

  • How will you differentiate a machine learning algorithm from other algorithms?

    Suggested reading

  • What’s the difference between data mining and machine learning?
  • What are the advantages of machine learning?

    It is often much more accurate than human-crafted rules (since it is data driven); humans are often incapable of expressing what they know (e.g., rules of English, or how to recognize letters) but can easily classify examples; it does not need a human expert or programmer; and it is cheap and flexible, since it can be applied to any learning task.

  • Describe some popular machine learning methods.

    Suggested Reading

  • How will you differentiate between supervised and unsupervised learning? Give a few examples of algorithms for supervised learning.

    Suggested reading

  • What is your favorite ML algorithm? How will you explain it to a layman? Why is it your favourite?

Regression

  • Is regression some type of supervised learning? Why?
  • Explain the tradeoff between bias and variance in a regression problem.
  • A learning algorithm with low bias and high variance may be suitable under what circumstances?
  • What is regression analysis?
  • What do coefficient estimates mean?
  • How do you measure fit of the model? What do R and D mean?
  • What are some possible problems with regression models? How do you avoid or compensate for them?
  • Name a few types of regression you are familiar with? What are the differences?
  • What are the downfalls of using too many or too few variables for performing regression?

    Linear Regression

    Suggested reading on difference between linear and non-linear regression

  • What is linear regression? Why is it called linear?
  • What are the constraints you need to keep in mind when using a linear regression?
  • How does the variance of the error term change with the number of predictors, in OLS?
  • In linear regression, under what conditions does R^2 always equal exactly 1?
  • Do you consider the models Y~X1+X2+X1X2 and Y~X1+X2+X1X2 to be linear? Why?

    Suggested reading

  • Do we always need the intercept term? When do we need it and when do we not?

    Suggested reading

  • What is collinearity, and what can you do about it?

    Suggested reading

  • How to remove multicollinearity?

    Suggested reading

  • What is overfitting a regression model? What are ways to avoid it?

    Suggested reading

  • What is Ridge Regression? How is it different from OLS Regression? Why do we need it?
  • What is Lasso regression? How is it different from OLS and Ridge?
  • What are the assumptions that standard linear regression models with standard estimation techniques make?
  • How can some of these assumptions be relaxed?
  • You fit a multiple regression to examine the effect of a particular feature. The feature comes back insignificant, but you believe it is significant. How will you explain it?
  • Your model considers the feature X significant, and Z is not, but you expected the opposite result. How will you explain it?
  • How to check if the regression model fits the data well?
  • When to use k-Nearest Neighbors for regression?
  • Could you explain some extensions of linear models, such as splines or LOESS/LOWESS?

    Classification

    Basic Questions

  • State some real life problems where classification algorithms can be used?

    Text categorization (e.g., spam filtering); fraud detection; optical character recognition; machine vision (e.g., face detection); natural-language processing (e.g., spoken language understanding); market segmentation (e.g., predicting whether a customer will respond to a promotion); bioinformatics (e.g., classifying proteins according to their function); etc.

  • What is the simplest classification algorithm?

    Many consider logistic regression a simple approach to begin with, in order to set a baseline, and only make the model more complicated if need be.

  • What is your favourite ML algorithm? Why is it your favourite? How will you describe it to a non-technical person?

    Decision Trees

    To answer questions on decision trees here are some useful links:
    Youtube video tutorial
    This article covers decision tree in depth

    Other suggested reading

  • What is a decision tree?
  • What are some business reasons you might want to use a decision tree model?
  • How do you build a decision tree model?
  • What impurity measures do you know?
  • Describe some of the different splitting rules used by different decision tree algorithms.
  • Is a big bushy tree always good?
  • How will you compare a decision tree to a logistic regression? Which is more suitable under different circumstances?
  • What is pruning and why is it important?

    Ensemble models:
    To answer questions on ensemble models here is a useful link:

  • Why do we combine multiple trees?
  • What is Random Forest? Why would you prefer it to SVM?

    Logistic regression:
    Link to understand basics of Logistic regression
    Here’s a nice tutorial from Khan Academy

  • What is logistic regression?
  • How do we train a logistic regression model?
  • How do we interpret its coefficients?

    Support Vector Machines
    A tutorial on SVM can be found here and here

  • What is the maximal margin classifier? How can this margin be achieved, and why is it beneficial?
  • How do we train SVM? What about hard SVM and soft SVM?
  • What is a kernel? Explain the Kernel trick
  • Which kernels do you know? How to choose a kernel?

    Neural Networks
    Here’s a link to Neural Network course from Hinton on Coursera

  • What is an Artificial Neural Network?
  • How to train an ANN? What is back propagation?
  • How does a neural network with three layers (one input layer, one inner layer and one output layer) compare to a logistic regression?
  • What is deep learning? What is a CNN (Convolutional Neural Network) or an RNN (Recurrent Neural Network)?

    Other models:

  • What other models do you know?
  • How can we use Naive Bayes classifier for categorical features? What if some features are numerical?
  • Tradeoffs between different types of classification models. How to choose the best one?
  • Compare logistic regression with decision trees and neural networks.

    Regularization

    Suggested Reading: wikipedia and Quora answers

  • What is Regularization?
  • Which problem does Regularization try to solve?

    Ans. Regularization is used to address the overfitting problem. It penalizes the loss function by adding a multiple of the L1 norm (LASSO) or the squared L2 norm (Ridge) of the weight vector w (the vector of learned parameters in, for example, a linear regression).

  • What does it mean (practically) for a design matrix to be “ill-conditioned”?
  • When might you want to use ridge regression instead of traditional linear regression?
  • What is the difference between the L1 and L2 regularization?
  • Why (geometrically) does LASSO produce solutions with zero-valued coefficients (as opposed to ridge)?
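
The penalty described in the answer above can be made concrete with a minimal sketch. It assumes the simplest possible setting, a single weight w with no intercept, where both penalized problems have a closed-form solution. The function names `ridge_1d` and `lasso_1d` are illustrative and do not come from any library.

```python
def ridge_1d(xs, ys, lam):
    """Minimise sum((y - w*x)^2) + lam * w^2 over a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # The L2 penalty inflates the denominator, shrinking w but never to zero.
    return sxy / (sxx + lam)


def lasso_1d(xs, ys, lam):
    """Minimise sum((y - w*x)^2) + lam * |w| over a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Soft-thresholding: subtract lam/2 from |sxy| and clip at exactly zero.
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return (shrunk if sxy >= 0 else -shrunk) / sxx


xs = [1.0, 2.0, 3.0]
ys = [0.5, 1.0, 1.5]  # unpenalised least squares gives w = 0.5

for lam in (0.0, 7.0, 14.0):
    print(lam, ridge_1d(xs, ys, lam), lasso_1d(xs, ys, lam))
```

In this toy setting the lasso weight reaches exactly zero once the penalty is large enough, while the ridge weight only shrinks towards zero, which is one way to approach the L1 versus L2 questions in this section.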

    Dimensionality Reduction

    Suggested Reading: Scikit and Kdnuggets

  • What is the purpose of dimensionality reduction and why do we need it?
  • Are dimensionality reduction techniques supervised or not? Are all of them (un)supervised?
  • What ways of reducing dimensionality do you know?
  • Is feature selection a dimensionality reduction technique?
  • What is the difference between feature selection and feature extraction?
  • Is it beneficial to perform dimensionality reduction before fitting an SVM? Why or why not?

    Principal Component Analysis

  • What is Principal Component Analysis (PCA)? Under what conditions is PCA effective? How is it related to eigenvalue decomposition (EVD)?
  • What are the differences between Factor Analysis and Principal Component Analysis?
  • How will you use SVD to perform PCA? When is SVD better than EVD for PCA?
  • Why do we need to center data for PCA and what can happen if we don’t do it?
  • Do we need to normalize data for PCA? Why?
  • Is PCA a linear model or not? Why?

    Other Dimensionality Reduction techniques:

  • Do you know other Dimensionality Reduction techniques?
  • What is Independent Component Analysis (ICA)? What’s the difference between ICA and PCA?
  • Suppose you have a very sparse matrix where rows are highly dimensional. You project these rows on a random vector of relatively small dimensionality. Is it a valid dimensionality reduction technique or not?
  • Have you heard of Kernel PCA or other non-linear dimensionality reduction techniques? What about LLE (Locally Linear Embedding) or t-SNE (t-distributed Stochastic Neighbor Embedding)?
  • What is Fisher Discriminant Analysis? How is it different from PCA? Is it supervised or not?

    Cluster Analysis

    Suggested reading: tutorialspoint and Lecture notes

  • Why do you need to use cluster analysis?
  • Give examples of some cluster analysis methods?
  • Differentiate between partitioning method and hierarchical methods.
  • Explain K-Means and its objective?
  • How do you select K for K-Means?
  • How would you assess the quality of clustering?
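
For the K-Means questions above, a minimal sketch of the usual alternating procedure (assign each point to its nearest center, then move each center to the mean of its points) may help. It is restricted to one-dimensional data for brevity. The names `kmeans_1d` and `inertia`, the fixed iteration count, and the starting centers are assumptions made for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd-style k-means on 1-D data, starting from the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: (p - centers[j]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
    return centers


def inertia(points, centers):
    """K-means objective: sum of squared distances to the nearest center."""
    return sum(min((p - c) ** 2 for c in centers) for p in points)


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, inertia(points, centers))
```

Running the same function from several different starting centers and keeping the result with the lowest `inertia` is one common way to reduce the impact of poor local optima.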

    Optimization

    Here is a good video to learn about optimization.

    Some basic questions about optimization

  • Give examples of some convex and non-convex algorithms.

    Examples of convex optimisation problems in machine learning

    Linear regression and ridge regression (Tikhonov regularisation); sparse linear regression with L1 regularisation, such as lasso; support vector machines; parameter estimation in linear-Gaussian time series (Kalman filter and friends)

    Typical examples of non-convex optimization in ML are

    Neural networks; maximum likelihood mixtures of Gaussians

  • What is Gradient Descent Method?
  • Tell us the difference between Batch Gradient Descent and Stochastic gradient descent.
  • Give examples of some convex optimization problems in machine learning
  • Give examples of gradient-based algorithms that use second-order information.
  • Do gradient descent methods always converge to the same point?
  • Is it necessary that the gradient descent method will always find the global minimum?
  • What is a local optimum, and why is it important in a specific context such as k-means clustering? What are specific ways of determining whether you have a local optimum problem? What can be done to avoid local optima? Read possible answer

    Suggested Reading

  • Explain Newton’s method.

    Suggested Reading

  • What kind of problems are well suited for Newton’s method? BFGS? SGD?
  • What are “slack variables”?
  • Describe a constrained optimization problem and how you would tackle it.
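
For the questions above on batch versus stochastic gradient descent, the following sketch contrasts the two update rules on a one-parameter least-squares problem, which is convex. The learning rate, step counts, seed, and function names are assumptions chosen for illustration, not tuned recommendations.

```python
import random


def batch_gd(xs, ys, lr=0.05, steps=200):
    """Batch gradient descent: each step uses the gradient over all examples."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


def sgd(xs, ys, lr=0.05, epochs=200, seed=0):
    """Stochastic gradient descent: each step uses a single example."""
    rng = random.Random(seed)
    w = 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a fresh random order
        for i in order:
            grad = 2 * (w * xs[i] - ys[i]) * xs[i]
            w -= lr * grad
    return w


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # noise-free data generated with w = 2
print(batch_gd(xs, ys), sgd(xs, ys))
```

Because this toy data is noise-free and the loss is convex, both variants approach the same minimizer. With noisy data and a constant learning rate, the stochastic version typically keeps fluctuating around the minimum instead of settling on it.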

    Recommendation

    Some good examples of recommender models can be found here

  • What is a recommendation engine? How does it work?
  • How to do customer recommendation?
  • What is Collaborative Filtering?
  • How would you generate related searches for a search engine?
  • How would you suggest followers on Twitter?
  • Do you know about the Netflix Prize problem? How would you approach it?

    Here is a nice post on the Netflix challenge

    Feature Engineering

    Here is a good article on feature engineering

  • What is Feature Engineering?

    How predictors are encoded in a model can have a significant impact on model performance and we achieve such encoding through feature engineering. Sometimes using combinations of predictors can be more effective than using the individual values: the product of two predictors may be more effective than using two independent predictors. Often the most effective encoding of the data is captured by the modeler’s understanding of the problem and thus is not derived from any mathematical technique.
    These features can be extracted in two ways: 1. By a human expert (known as hand-crafted) or 2. By using automated feature extraction methods such as PCA, or Deep Learning tools such as DBN. Both 1 and 2 can be used on top of each other.

  • Give an example where feature engineering can be very useful in predicting results from data, and explain why it is so effective in some cases.
  • What are some good ways for performing feature selection that do not involve exhaustive search?
  • How to convert categorical variables to numerical for extracting features?
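
Two of the ideas in this section, using the product of two predictors and converting a categorical variable to numbers, can be sketched in a few lines. The helper names `add_interaction` and `one_hot` and the example column names are hypothetical, and one-hot encoding is only one of several possible encodings.

```python
def add_interaction(rows, a, b):
    """Return copies of the rows with an extra feature equal to rows[a] * rows[b]."""
    name = f"{a}_x_{b}"
    return [dict(row, **{name: row[a] * row[b]}) for row in rows]


def one_hot(values):
    """Encode categorical values as 0/1 indicator vectors, one slot per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


# Hand-crafted feature: the area of a room may predict price better than
# width and height taken separately.
rooms = [{"width": 2.0, "height": 3.0}, {"width": 4.0, "height": 1.5}]
print(add_interaction(rooms, "width", "height"))

# Categorical to numerical: each colour becomes its own indicator column.
print(one_hot(["red", "blue", "red"]))
```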

    Feature Selection

    Here is a nice post on feature selection,
    also known as variable selection, attribute selection or variable subset selection

  • Explain feature selection and its importance with examples.
  • What is variance threshold approach?
  • How does univariate feature selection work?
  • Is there any negative impact of using too many or too few variables?
  • Is there any rule of thumb for the number of features that should be used? How do you select the best features?
  • What will be your approach to recursive feature elimination?
  • Describe some feature selection methods.
  • Does the model affect the choice of feature selection method?

    Natural Language Processing (NLP)

    For basic introduction visit the wiki page.
    Here is the link to coursera course for NLP
    Pick the software from the Stanford NLP (Natural Language Processing) Group and input some text to view its parse tree, named entities, part-of-speech tags, etc.
    If the company deals with text data, you can expect some questions on NLP and Information Retrieval:

  • Explain NLP to a non-technical person.
  • What’s the use of NLP in Machine Learning?

    Some interesting uses are in areas like sentiment analysis, spam detection, part-of-speech (POS) tagging, text summarization, language translation, etc.

  • How can unstructured text data be converted into structured data for the purpose of ML models?
  • Explain Vector Space Model and its use?
  • Explain the distances and similarity measures that can be used to compare documents?
  • Explain cosine similarity in a simple way?

    Suggested Reading

  • Why and when are stop words removed? In which situations do we not remove them?
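
For the vector space model and cosine similarity questions above, here is a small sketch that turns each document into a bag-of-words count vector and compares two documents by the cosine of the angle between their vectors. Whitespace tokenization and lowercasing are simplifying assumptions, and the function names are illustrative.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a document to term counts (naive lowercase whitespace tokenizer)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)


doc1 = bag_of_words("the cat sat")
doc2 = bag_of_words("the cat ran")
doc3 = bag_of_words("stock prices fell")
print(cosine_similarity(doc1, doc2))  # two of three terms shared
print(cosine_similarity(doc1, doc3))  # no shared terms
```

The score depends only on the direction of the count vectors, not their length, so a long document and a short one on the same topic can still score as similar.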

    Image processing and Text mining

  • What tool would you prefer for image processing?

    Some popular tools are: MATLAB, OpenCV or Octave

  • What parameters would you consider while selecting a tool for image processing?

    Ease of use, speed and resources needed are some of the common parameters

  • How to apply Machine Learning to images?
  • What are the text mining tools you are familiar with?

    Some examples are:
    Commercial: Autonomy, Lexalytics, SAS/SPSS, SQLServer 2008+
    OpenSource: RapidMiner, NClassifier, OpenTextSumarizer, WordNet, OpenNLP/SharpNLP, Lucene/Lucene.NET, LingPipe, Weka

  • What techniques do you apply for processing texts? Explain with an example.
  • How to apply Machine Learning to audio data?

    Meta Learning

    Wiki link on meta learning

  • How will you differentiate between boosting and inductive transfer?

    Model selection

  • What criteria would you use while selecting the best model from many different models?
  • You have one model and want to find the best set of parameters for this model. How would you do that?
  • How would you use model tuning for arriving at the best parameters?

    Suggested Reading

  • Explain grid search and how you would use it?
  • What is Cross-Validation?
  • What is 10-Fold CV?
  • What is the difference between holding out a validation set and doing 10-Fold CV?
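
To make the cross-validation questions above concrete, this sketch generates the train and validation index sets for k-fold CV. It assumes contiguous, unshuffled folds for readability, whereas in practice the data is often shuffled first. The function name `k_fold_indices` is illustrative.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, validation_indices) for k folds over range(n)."""
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, validation
        start += size


for train, validation in k_fold_indices(10, 3):
    print("train:", train, "validation:", validation)
```

With a single hold-out set, each example is used either for training or for validation. With k-fold CV, every example is used for validation exactly once and for training in the other k - 1 rounds, at the cost of fitting the model k times.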

    Evaluating Machine Learning

  • How do you know if your model overfits?
  • How do you assess the results of a logistic regression?
  • Which evaluation metrics do you know? Something apart from accuracy?
  • Which is better: Too many false positives or too many false negatives?
  • What are precision and recall?
  • What is a ROC curve? Write pseudo-code to generate the data for such a curve.
  • What is AUROC (AUC)?
  • Do you know about Concordance or Lift?
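
One possible answer to the ROC question above, written as runnable Python instead of pseudo-code: sort the examples by score, lower the threshold one distinct score at a time, and record the false positive rate and true positive rate at each step. The sketch assumes binary 0/1 labels with at least one example of each class. The function names are illustrative.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs for a ROC curve."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples tied at this score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve)
print(auc(curve))
```

The area computed this way equals the fraction of positive-negative pairs in which the positive example receives the higher score, which is also the idea behind concordance.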

    Discussion Questions

  • You have a marketing campaign and you want to send emails to users. You developed a model for predicting if a user will reply or not. How can you evaluate this model? Is there a chart you can use?

    Miscellaneous

    Curse of Dimensionality

  • What is Curse of Dimensionality? What is the difference between density-sparse data and dimensionally-sparse data?

    Suggested Reading

  • When dealing with correlated features in your data set, how do you reduce the dimensionality of the data?
  • What are the problems of large feature space? How does it affect different models, e.g. OLS? What about computational complexity?
  • What dimensionality reduction techniques can be used for preprocessing the data?

    Others

  • You are training an image classifier with limited data. What are some ways you can augment your dataset?

from: http://analyticscosm.com/machine-learning-interview-questions-for-data-scientist-interview/
