How do you ensure that a model is not overfitting?
What is the “curse of dimensionality”?
Is Naive Bayes a supervised or unsupervised method?
How do you decide whether a problem is a machine learning problem?
Why do we convert categorical variables into factors? Which function is used in R to do this?
What do you think of our current data process?
Is Octave good for machine learning?
What is the difference between inductive learning and deductive learning?
What is data augmentation? Can you give some examples?
What is bucketing in machine learning?
What is the curse of dimensionality? Can you list some ways to deal with it?
What do you know about Bayesian networks?
What are the common ways to handle missing data in a dataset?
Can you give an example where ensemble techniques might be useful?
What is the “kernel trick”, and how is it useful?