What is an example of a data set with a non-Gaussian distribution?
How do data management procedures like missing data handling make selection bias worse?
A die is rolled twice. What is the probability that the second roll is a 6?
Name some kinds of graphs and explain how you would build them in Python or R.
What is A/B testing in data science?
What does the future hold for data scientists?
A stranger uses a search engine to find something, and you know nothing about the person. How would you design an algorithm to determine what the stranger is looking for just after he or she types a few characters in the search box?
In k-means or kNN, we use Euclidean distance to calculate the distance between nearest neighbors. Why not Manhattan distance?
What are the types of business decisions?
What is Jupyter used for?
How do you run the debugger in PyCharm?
What is the difference between a validation set and a test set?
What are interpolation and extrapolation?
What are CART and CHAID? How is bagging different from boosting?
How long should a pip last?