What are numpy, scipy, and spark essential datatypes?
How to solve multi-collinearity?
Write a function that takes in two sorted lists and outputs a sorted list that is their union?
Explain each data visualization in detail?
Explain principal components analysis with equations.
Define linear regression?
How do you take millions of users with 100's of transactions each, amongst 10000's of products and group the users together in a meaningful segments?
What is the role of activation function?
You are given 50 cards with five different colors- 10 Green cards, 10 Red Cards, 10 Orange Cards, 10 Blue cards, and 10 Yellow cards. The cards of each colors are numbered from one to ten. Two cards are picked at random. Find out the probability that the cards picked are not of same number and same color.
What is Survival Analysis?
Define a sql query? What is the difference between select and update query?
You can roll a dice three times. You will be given $X where X is the highest roll you get. You can choose to stop rolling at any time (example, if you roll a 6 on the first roll, you can stop). What is your expected pay-out?
Which is the best suitable language among python and r for text analytics?
In k-means or knn, we use euclidean distance to calculate the distance between nearest neighbors. Why not manhattan distance?
Explain how to write a table to a file?