Data Science with Python

Python is one of the best choices for data science professionals. If you are looking to build a career in data science, you need a sound knowledge of Python. These interview questions can help you build a solid foundation for your career.

Q.1 What do you understand by feature vectors?
A feature vector is an n-dimensional vector of numerical features that represents some object. In machine learning, feature vectors are used to represent the numeric or symbolic characteristics of an object, referred to as features, in a mathematical, easily analyzable way.
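As a minimal sketch, a feature vector can be built by mapping an object's attributes to numbers in a fixed order. The `house` record and the `to_feature_vector` helper below are hypothetical names chosen only for illustration.

```python
# Hypothetical record describing one object (a house).
house = {"area_sqm": 120.0, "bedrooms": 3, "has_garden": True}

def to_feature_vector(record):
    """Map a record to a fixed-order list of numeric features."""
    return [
        float(record["area_sqm"]),
        float(record["bedrooms"]),
        1.0 if record["has_garden"] else 0.0,  # encode the symbolic feature as 0/1
    ]

vector = to_feature_vector(house)
print(vector)  # [120.0, 3.0, 1.0]
```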
Q.2 What are the steps in making a decision tree?
The steps in making a decision tree -
1. Take the entire data set as input.
2. Look for a split that maximizes the separation of the classes. A split is any test that divides the data into two sets.
3. Apply the split to the input data.
4. Re-apply steps 1 to 3 to each of the divided sets.
5. Stop when you meet some stopping criteria.
6. Clean up the tree if you went too far doing splits. This step is called pruning.
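The search for a split that best separates the classes can be sketched as follows. This is an illustrative version for a single numeric feature that scores splits with Gini impurity; the choice of Gini and the names `gini` and `best_split` are assumptions, not something the steps above prescribe.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Return the threshold with the lowest weighted impurity, and that impurity."""
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        if not left or not right:
            continue  # a split must divide the data into two non-empty sets
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, score = best_split(values, labels)
print(threshold, score)  # 3.0 0.0
```

A full tree would apply `best_split` recursively to the left and right subsets until a stopping criterion is met.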
Q.3 What do you understand by root cause analysis?
Root cause analysis was originally developed to analyze industrial accidents but is now widely used in other areas. It is a problem-solving technique for isolating the root causes of faults or problems. A factor is called a root cause if removing it from the problem-fault sequence prevents the final undesirable event from recurring.
Q.4 What do you understand by logistic regression?
Logistic regression, also referred to as the logit model, is a technique for predicting a binary outcome from a linear combination of predictor variables.
Q.5 What do you understand by Recommender Systems?
Recommender systems are a subclass of information filtering systems that are used to predict the preferences or ratings that a user would give to a product.
Q.6 What do you understand by cross-validation?
Cross-validation is a model validation technique used for evaluating how the outcomes of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the objective is prediction and one wants to estimate how accurately a model will perform in practice.
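One common form is k-fold cross-validation, where each fold serves once as the held-out test set. The sketch below only generates the index splits; the helper name `k_fold_indices` and the use of contiguous, unshuffled folds are simplifying assumptions.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), test)
```

A model would be fitted on each `train` set and scored on the matching `test` set, and the scores averaged.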
Q.7 How do you define Collaborative Filtering?
Collaborative filtering is the filtering process used by most recommender systems to find patterns and information by combining the viewpoints of multiple users, data sources, and agents.
Q.8 Do gradient descent methods always converge to the same point?
No. Gradient descent methods do not always converge to the same point, because in some cases they reach a local minimum rather than the global one. Whether you reach the global optimum depends on the data and the starting conditions.
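The dependence on starting conditions can be seen on a small non-convex function. The function `f`, the learning rate, and the step count below are arbitrary choices made for this sketch.

```python
def f(x):
    # A non-convex function with two minima of different depth.
    return x**4 - 2 * x**2 + 0.5 * x

def grad(x):
    # Derivative of f.
    return 4 * x**3 - 4 * x + 0.5

def gradient_descent(start, learning_rate=0.01, steps=2000):
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

left = gradient_descent(-2.0)   # ends in the deeper minimum (x < 0)
right = gradient_descent(2.0)   # ends in the shallower, local minimum (x > 0)
print(left, f(left))
print(right, f(right))
```

Both runs stop where the gradient is essentially zero, but only the run started at -2.0 reaches the lower of the two minima.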
Q.9 What is the primary goal of A/B Testing?
A/B testing is statistical hypothesis testing for randomized experiments with two variants, A and B. The objective of A/B testing is to detect whether a change to a web page increases or maximizes the outcome of a strategy.
Q.10 How do you define Law of Large Numbers?
The Law of Large Numbers is a theorem that describes the result of performing the same experiment a large number of times. It forms the basis of frequency-style thinking. It states that the sample mean, the sample variance and the sample standard deviation converge to the quantities they are trying to estimate.
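A quick simulation illustrates the idea for the sample mean. The fair-coin experiment, the seed, and the sample sizes are assumptions made for this sketch.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def mean_of_coin_flips(n):
    """Fraction of heads in n simulated fair coin flips."""
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads / n

small = mean_of_coin_flips(10)
large = mean_of_coin_flips(100_000)
print(small, large)  # the larger sample should sit much closer to 0.5
```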
Q.11 What do you understand by confounding variables?
Confounding variables are extraneous variables in a statistical model that correlate directly or inversely with both the dependent and the independent variable. An estimate that fails to account for the confounding factor is biased.
Q.12 What do you understand by star schema?
A star schema is a traditional database schema with a central fact table. Satellite tables map IDs to physical names or descriptions and are connected to the central fact table through the ID fields. These tables are referred to as lookup tables and are principally useful in real-time applications, as they save a lot of memory. Star schemas sometimes involve several layers of summarization to retrieve information faster.
Q.13 How often must an algorithm be updated?
We would want to update an algorithm when -

1. The model needs to evolve as data streams through the infrastructure
2. The underlying data source is changing
3. There is a case of non-stationarity
Q.14 What do you understand by Eigenvalue and Eigenvector?
Eigenvectors are used for understanding linear transformations. In data analysis, we usually calculate the eigenvectors of a correlation or covariance matrix. Eigenvectors are the directions along which a particular linear transformation acts by flipping, compressing or stretching. An eigenvalue is the factor by which the transformation scales the corresponding eigenvector.
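For a small symmetric matrix this can be checked by hand. The sketch below uses the closed-form eigenvalues of a 2x2 symmetric matrix; the helper name and the example matrix [[2, 1], [1, 2]] are illustrative assumptions.

```python
import math

def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], largest first."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean + radius, mean - radius

lam1, lam2 = eigenvalues_2x2_symmetric(2.0, 1.0, 2.0)

# The direction (1, 1) is an eigenvector for lam1: applying the matrix
# only stretches it by the factor lam1, without changing its direction.
v = (1.0, 1.0)
Av = (2.0 * v[0] + 1.0 * v[1], 1.0 * v[0] + 2.0 * v[1])
print(lam1, lam2, Av)  # 3.0 1.0 (3.0, 3.0)
```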
Q.15 What is the purpose of resampling?
Resampling is done in the following cases -
1. Estimating the accuracy of sample statistics by using subsets of accessible data or drawing randomly with replacement from a set of data points
2. Substituting labels on data points when performing significance tests
3. Validating models by using random subsets
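The first case, drawing randomly with replacement, is the bootstrap. A minimal sketch, assuming a small made-up data set and 1000 resamples:

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
data = [2.1, 2.5, 3.0, 3.2, 3.8, 4.1, 4.4, 5.0]  # made-up sample

def bootstrap_means(sample, n_resamples):
    """Means of resamples drawn with replacement, each the size of the sample."""
    means = []
    for _ in range(n_resamples):
        resample = [rng.choice(sample) for _ in sample]
        means.append(statistics.mean(resample))
    return means

boot_means = bootstrap_means(data, 1000)
std_error = statistics.stdev(boot_means)  # bootstrap estimate of the standard error
print(round(std_error, 3))
```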
Q.16 What do you understand by selection bias?
Selection bias is a problematic situation in which error is introduced because the population sample is not random.
Q.17 During Sampling, what are the types of biases that can occur?
The types of bias that can occur in sampling are selection bias, undercoverage bias, and survivorship bias.
Q.18 What is survivorship bias?
Survivorship bias is the logical error of focusing on the aspects that survived some process while overlooking those that did not because of their lack of prominence. This can lead to wrong conclusions in numerous ways.
Q.19 How will you work towards a random forest?
The underlying principle of this technique is that several weak learners are combined to provide a strong learner. The steps involved are -
1. Build several decision trees on bootstrapped training samples of the data.
2. On each tree, each time a split is considered, choose a random sample of $m$ predictors as split candidates out of all $p$ predictors.
Rule of thumb: at each split, $m \approx \sqrt{p}$.
Predictions: by majority rule.
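The three ingredients (bootstrapped samples, a random subset of predictors per split, and majority voting) can be sketched on their own, without the tree-fitting code. The helper names are hypothetical, and the subset size uses the common square-root rule of thumb.

```python
import math
import random
from collections import Counter

rng = random.Random(0)  # fixed seed so the run is repeatable

def bootstrap_sample(rows):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]

def candidate_features(p):
    """Pick about sqrt(p) distinct feature indices to consider at one split."""
    m = max(1, round(math.sqrt(p)))
    return rng.sample(range(p), m)

def majority_vote(predictions):
    """Combine the per-tree predictions into one class label."""
    return Counter(predictions).most_common(1)[0][0]

rows = list(range(8))
sample = bootstrap_sample(rows)
features = candidate_features(9)
winner = majority_vote(["spam", "ham", "spam"])
print(sample, features, winner)
```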
Q.20 What is the usage of decorators?
Decorators in Python are primarily used to modify or inject code in functions or classes. Using a decorator, we can wrap a class or function call so that a piece of code is executed before or after the original code. Decorators can also be used to check permissions, modify or track the arguments passed to a method, log the calls to a specific method, and so on.
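A small sketch of the "run code before and after" use case. The decorator name `log_calls` and the `calls` list are hypothetical; a real logger would likely use the `logging` module instead.

```python
import functools

calls = []  # records what happened, in order

def log_calls(func):
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        calls.append(f"before {func.__name__}")
        result = func(*args, **kwargs)
        calls.append(f"after {func.__name__}")
        return result
    return wrapper

@log_calls
def add(a, b):
    return a + b

total = add(2, 3)
print(total, calls)  # 5 ['before add', 'after add']
```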
Q.21 What is logistic regression and how does it work?
Logistic regression measures the relationship between the dependent variable (our label for what we want to predict) and one or more independent variables (our features) by estimating probabilities using its underlying logistic function (sigmoid).
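The sigmoid maps a linear combination of the features to a probability between 0 and 1. In the sketch below the weights and bias are made-up numbers, not fitted values.

```python
import math

def sigmoid(z):
    """Logistic function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Probability of the positive class for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical weights and bias, chosen only for illustration.
probability = predict_proba([1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(probability)  # 0.5, because the linear combination is exactly 0
```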
Q.22 Describe the stages involved in creating a decision tree.
1. Take the complete data set as input.
2. Calculate the entropy of the target variable and of the predictor attributes.
3. Calculate the information gain of each attribute (the information gained by sorting different objects from each other).
4. Choose the attribute with the largest information gain as the root node.
5. Repeat the procedure on each branch until the decision node of each branch is finalized.
Q.23 What are Python functions?
A function is a chunk of code that is only executed when the function is called.
Q.24 In Python, what is __init__?
__init__ is a special method in Python classes that plays the role of a constructor in OOP terminology. When a new object is created, the __init__ method is invoked automatically. Memory for the object has already been allocated by that point (by __new__); __init__ initializes the object, typically by assigning its instance variables.
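A minimal example, using a hypothetical `Point` class:

```python
class Point:
    def __init__(self, x, y=0):
        # Runs automatically when Point(...) is called; sets up instance variables.
        self.x = x
        self.y = y

p = Point(3)
print(p.x, p.y)  # 3 0
```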
Q.25 What are the most prevalent Python built-in data types?
Python includes the following built-in data types:
Immutable data types: Number, String, Tuple
Mutable data types: List, Dictionary, Set
Q.26 What are the differences between local and global variables in Python?
Local variable: Any variable declared inside a function is referred to as a local variable, and its accessibility is limited to that function. Global variable: Any variable defined outside a function is referred to as a global variable, and it can be accessed by any function in the program.
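The difference shows up when a function assigns to a name. The names below are illustrative.

```python
counter = 0  # global variable

def increment():
    global counter  # needed to rebind the global name from inside a function
    counter += 1

def shadow():
    counter = 100  # a new local variable; the global one is untouched
    return counter

increment()
local_value = shadow()
print(counter, local_value)  # 1 100
```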
Q.27 In Python, what is type conversion?
Type conversion is a feature provided by Python that allows you to convert one data type into another. Type conversion is divided into two categories: 1. Implicit Type Conversion: the Python interpreter automatically converts one data type to another without user intervention. 2. Explicit Type Conversion: the user converts the data type to the required type.
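A short illustration of both kinds of conversion:

```python
# Implicit: the int is promoted to float automatically during the addition.
total = 3 + 0.5
print(type(total).__name__, total)  # float 3.5

# Explicit: the programmer asks for the conversion.
count = int("42")
label = str(3.14)
print(count, label)  # 42 3.14
```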
Q.28 What are Python packages and how can I use them?
A Python package is a collection of sub-packages and modules that are related to each other in terms of function. You use a package by importing it, or a module from it, with the import statement.
Q.29 What are Python decorators?
Decorators are essentially functions that allow you to add functionality to an existing function without changing its structure. They are written as @decorator_name above the function definition and, when stacked, are applied bottom-up.
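The bottom-up order is easiest to see with two stacked decorators. The names `bold` and `italic` are hypothetical.

```python
def bold(func):
    def wrapper():
        return "<b>" + func() + "</b>"
    return wrapper

def italic(func):
    def wrapper():
        return "<i>" + func() + "</i>"
    return wrapper

@bold      # applied second, so it ends up outermost
@italic    # applied first, closest to the function
def greet():
    return "hi"

print(greet())  # <b><i>hi</i></b>
```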
Q.30 Is the case of python important?
Python is a case-sensitive programming language. This means that, unlike in SQL and Pascal, Function and function are distinct in Python.
Q.31 What is the function of [::-1]?
[::-1] is slice notation that reverses a sequence. The general form is [start:stop:step]; with start and stop omitted and a step of -1, the slice walks the whole sequence backwards.
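For example:

```python
letters = ["a", "b", "c", "d"]

print(letters[::-1])    # ['d', 'c', 'b', 'a']
print("python"[::-1])   # nohtyp
print(letters[3:0:-1])  # ['d', 'c', 'b'] (stop index 0 is excluded)
```

The slice returns a new sequence; `letters` itself is unchanged.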
Q.32 Is it necessary to use indentation in Python?
Indentation is required in Python and is part of the language's syntax. Every programming language has a mechanism for specifying the scope and extent of a code block; in Python, that mechanism is indentation. Indentation also improves the readability of the code, which is probably why Python requires it.
Q.33 How to comment with multiple lines in Python?
In Python, every line of a multi-line comment must be prefixed with #.
Q.34 Is Python a programming language or a scripting language?
Python is a general-purpose programming language that can also be used for scripting.
Q.35 What are negative indices, and why do we utilise them?
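To get an element from an ordered sequence, we use the index of the element, which is its position number. The index normally starts at zero, so the first element has index zero, the second has index one, and so on. Negative indices count from the end of the sequence instead: -1 is the last element, -2 is the second to last, and so on. We use them to reach elements near the end without having to know the length of the sequence.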
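A short illustration:

```python
items = [10, 20, 30, 40]

print(items[0])    # 10, first element
print(items[-1])   # 40, last element
print(items[-2])   # 30, second to last
print(items[-3:])  # [20, 30, 40], the last three elements
```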
Q.36 Explain the split(), sub(), and subn() methods of the Python "re" module.
These functions are used to modify strings and are part of the Python regular expression module, re. split(): splits a string into a list wherever the regex pattern matches. sub(): finds the substrings that match a regex pattern and replaces them with a different string. subn(): similar to sub(), but it returns a tuple containing the new string and the number of replacements made.
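The three functions side by side, on made-up input strings:

```python
import re

parts = re.split(r"\s*,\s*", "a, b ,c")
print(parts)  # ['a', 'b', 'c']

replaced = re.sub(r"\d+", "#", "room 12, floor 3")
print(replaced)  # room #, floor #

result = re.subn(r"\d+", "#", "room 12, floor 3")
print(result)  # ('room #, floor #', 2)
```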
Q.37 In Python, what is a map function?
In Python, the map() function takes two parameters: a function and an iterable. It applies the function to every element of the iterable supplied as the second parameter. It returns a map object, an iterator over the results, which can be converted to a list with list().
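For example, in Python 3:

```python
squares = map(lambda n: n * n, [1, 2, 3])

is_list = isinstance(squares, list)
print(is_list)  # False: map() returns a lazy iterator, not a list

result = list(squares)
print(result)  # [1, 4, 9]
```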
Q.38 What are the different types of generators in Python?
A generator is a function that returns an iterator over a sequence of items, producing them one at a time with the yield keyword.
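A minimal generator function, plus the equivalent generator expression; `countdown` is a hypothetical name.

```python
def countdown(n):
    """Yield n, n-1, ..., 1, producing each value only when asked."""
    while n > 0:
        yield n
        n -= 1

values = list(countdown(3))
print(values)  # [3, 2, 1]

# A generator expression builds the same kind of lazy object inline.
doubled = list(x * 2 for x in countdown(3))
print(doubled)  # [6, 4, 2]
```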
Q.39 What are iterators in Python?
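Iterators are objects that can be traversed one element at a time. An iterator implements the __iter__() and __next__() methods, and next() raises StopIteration when no elements remain.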
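A short sketch using the built-in `iter()` and `next()` functions:

```python
numbers = iter([1, 2])  # ask the list for an iterator

first = next(numbers)
second = next(numbers)

try:
    next(numbers)
    exhausted = False
except StopIteration:
    exhausted = True  # no elements left

print(first, second, exhausted)  # 1 2 True
```

A `for` loop performs these same calls behind the scenes.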
Q.40 In Python, do we need to specify variables with data types?
No. Python is a dynamically typed language, which means that the Python Interpreter detects the data type of a variable based on the value type provided to it.
Q.41 In Python, how do you write comments?
Comments are statements used by programmers to make their code more readable. # starts a single-line comment, and docstrings (strings enclosed within triple quotes) are often used to comment on multiple lines.
Q.42 Is Python capable of multiple inheritance?
Yes. Unlike Java, Python supports multiple inheritance. Multiple inheritance describes a situation in which a class is derived from more than one parent class, so it inherits the functionality of all of them.
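A small illustration with hypothetical classes. When both parents define the same method, Python's method resolution order (MRO) picks the parent listed first.

```python
class Flyer:
    def move(self):
        return "fly"

class Swimmer:
    def move(self):
        return "swim"

    def dive(self):
        return "dive"

class Duck(Flyer, Swimmer):  # inherits from both parents
    pass

duck = Duck()
print(duck.move(), duck.dive())  # fly dive
print([cls.__name__ for cls in Duck.__mro__])  # ['Duck', 'Flyer', 'Swimmer', 'object']
```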
Q.43 What are the differences between Dict and List comprehensions?
Comprehensions in Python, like decorators, are syntactic sugar; they help build altered and filtered lists, dictionaries, and sets from a given list, dictionary, or set. Comprehensions save a lot of time and code that would otherwise be much more verbose. Comprehensions are useful for: performing math operations across a full list; filtering a list with a condition; combining several lists into one; flattening a multi-dimensional list.
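A few of these uses, with list and dict comprehensions side by side:

```python
numbers = [1, 2, 3, 4]

doubled = [n * 2 for n in numbers]                   # math over the full list
evens = [n for n in numbers if n % 2 == 0]           # conditional filtering
squares = {n: n * n for n in numbers}                # dict comprehension: key -> value
flat = [x for row in [[1, 2], [3, 4]] for x in row]  # flatten a nested list

print(doubled)  # [2, 4, 6, 8]
print(evens)    # [2, 4]
print(squares)  # {1: 1, 2: 4, 3: 9, 4: 16}
print(flat)     # [1, 2, 3, 4]
```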
Q.44 Is Python an object-oriented language?
Python follows an object-oriented programming paradigm and includes all of the essential OOP concepts such as inheritance, polymorphism, and more, with the exception of access specifiers. Python does not support strong encapsulation (there is no private keyword to put before data members). It does, however, have a data hiding convention, which is to prefix a data member's name with two underscores.
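The double-underscore convention works through name mangling, which hides the attribute but does not truly protect it. `Account` is a hypothetical class.

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance  # stored as _Account__balance

    def balance(self):
        return self.__balance

acct = Account(50)
print(acct.balance())                # 50
print(hasattr(acct, "__balance"))    # False: the plain name is not an attribute
print(acct._Account__balance)        # 50: still reachable via the mangled name
```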
Q.45 What is the difference between pickling and unpickling?
The pickle module accepts a Python object, converts it to a byte stream, and saves it to a file using the dump function. This process is called pickling. Unpickling is the reverse: recovering the original Python object from the stored byte stream.
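A round trip can be sketched in memory with `pickle.dumps` and `pickle.loads`, the in-memory counterparts of `dump` and `load`. The `record` dictionary is made up for the example.

```python
import pickle

record = {"name": "model-a", "scores": [0.9, 0.8]}

blob = pickle.dumps(record)    # pickling: object -> bytes
restored = pickle.loads(blob)  # unpickling: bytes -> object

print(type(blob).__name__)  # bytes
print(restored == record)   # True
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.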
Q.46 What does Tkinter mean to you?
Tkinter is a built-in Python tool for creating graphical user interfaces. It is Python's standard GUI development toolkit. Tkinter is included with Python, therefore there is no need to install it separately. Importing it into your script will allow you to begin using it.
Q.47 What are the functions of the operators is, not, and in?
Operators are special functions that take one or more values (operands) and produce a corresponding result. is: returns True when both operands refer to the same object (for example, x is x). not: returns the opposite of the operand's Boolean value (for example, not True is False, and vice versa). in: determines whether an element is contained in a particular sequence.
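The difference between identity (`is`) and equality (`==`) is the part that most often causes confusion:

```python
a = [1, 2]
b = a        # same object as a
c = [1, 2]   # equal contents, but a different object

print(a is b)    # True
print(a is c)    # False
print(a == c)    # True
print(not True)  # False
print(2 in a)    # True
```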
Q.48 Why isn't all the memory de-allocated when Python exits?
When Python exits, some objects, particularly modules with circular references to other objects or objects referenced from global namespaces, are not always destroyed or de-allocated. It is also not possible to de-allocate the portions of memory that are reserved by the C library. On exit, Python tries to de-allocate and destroy every other object using its own cleanup mechanism.
Q.49 In Python, what is polymorphism?
Polymorphism refers to a code's ability to take on various forms. For example, if the parent class has a method named XYZ, the child class can have a method named XYZ with its own variables and parameters.
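A minimal sketch with hypothetical `Shape` and `Square` classes, where the child class overrides the parent's method:

```python
class Shape:
    def area(self):
        return 0.0

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):  # same method name, different behavior
        return self.side ** 2

# The same call works on both objects; each uses its own version of area().
areas = [shape.area() for shape in (Shape(), Square(2.0))]
print(areas)  # [0.0, 4.0]
```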
Q.50 What does encapsulation mean in Python?
In Python, encapsulation refers to the process of combining variables and functions into a single object or capsule. The best example of encapsulation in Python is the class.