Deep Learning with Caffe2

Caffe2 is a deep learning framework built with expression, speed, and modularity in mind. Below, we have listed some interview questions that can help you prepare for a role in data science.

Q.1 Compare Caffe with Caffe2.
The original Caffe framework is considered useful for large-scale product use cases, especially given its unparalleled performance and well-tested C++ codebase. However, Caffe has some design choices inherited from its original use case: conventional CNN applications. As new computation patterns have emerged, especially distributed computation, mobile deployment, reduced-precision computation, and more non-vision use cases, its design has shown some limitations. The following points describe how Caffe2 improves on Caffe 1.0:
1. First-class support for large-scale distributed training
2. Mobile deployment
3. New hardware support (in addition to CPU and CUDA)
4. Flexibility for future directions such as quantized computation
5. Stress tested by the vast scale of Facebook applications
Q.2 What are the new features in Caffe2?
The basic unit of computation in Caffe2 is the Operator, which can be thought of as a more flexible version of a layer from Caffe. Caffe2 comes with over 400 different operators and provides guidance for the community to create and contribute to this growing resource.
Q.3 Differentiate between Caffe2 and PyTorch.
Some of the points of difference are:
1. Caffe2 was built to excel at mobile and large-scale deployments. Its multi-GPU support is new, bringing Torch and Caffe2 together with the same level of GPU support.
2. Caffe2 is built to make full use of both multiple GPUs on a single host and multiple hosts with GPUs. PyTorch, on the other hand, is great for research, experimentation, and trying out exotic neural networks.
3. Caffe2 is headed towards supporting more industrial-strength applications with a heavy focus on mobile. This does not mean that PyTorch cannot do mobile or cannot scale, or that you cannot use Caffe2 with some awesome new paradigm of neural network.
Q.4 Why should we use Caffe?
Following are the popular features of Caffe:
1. Expressive architecture encourages application and innovation. Models and optimization are defined by configuration without hard-coding. Switch between CPU and GPU by setting a single flag to train on a GPU machine, then deploy to commodity clusters or mobile devices.
2. Extensible code fosters active development. In Caffe’s first year, it has been forked by over 1,000 developers and had many significant changes contributed back. Thanks to these contributors the framework tracks the state-of-the-art in both code and models.
3. Speed makes Caffe perfect for research experiments and industry deployment. Caffe can process over 60M images per day with a single NVIDIA K40 GPU*. That’s 1 ms/image for inference and 4 ms/image for learning, and more recent library versions and hardware are faster still. We believe that Caffe is among the fastest convnet implementations available.
4. Community: Caffe already powers academic research projects, startup prototypes, and even large-scale industrial applications in vision, speech, and multimedia. Join our community of brewers on the caffe-users group and Github.
Q.5 Name the heterogeneous computing architectures currently supported by Caffe.
The heterogeneous computing architectures currently supported by Caffe are:
1. GPUs
2. FPGAs
3. Dedicated CNN processors
Q.6 What is Caffe?
CAFFE, short for Convolutional Architecture for Fast Feature Embedding, is a deep learning framework originally developed at the University of California, Berkeley. It is open source under a BSD license and is written in C++ with a Python interface.
Q.7 Explain the difference between Machine Learning and Deep Learning.
In a machine learning algorithm, the selection of features in the dataset plays an extremely important role in achieving the desired prediction accuracy. In traditional machine learning techniques, feature selection is done mostly by human inspection, judgment, and deep domain knowledge; such feature engineering is generally time-consuming and requires good expertise in the domain. In deep learning algorithms, by contrast, feature engineering is done automatically. To implement this automatic feature extraction, deep learning algorithms typically require a huge amount of data, so if you have only thousands or tens of thousands of data points, a deep learning technique may fail to give you satisfactory results. With larger data, deep learning algorithms produce better results than traditional ML algorithms, with the added advantage of little or no feature engineering.
Q.8 What are Caffe2 Operators?
In Caffe2, the Operator is the basic unit of computation, and Caffe2 provides an exhaustive list of operators. One example is the operator called FC, which computes the result of passing an input vector X through a fully connected network with a two-dimensional weight matrix W and a one-dimensional bias vector b. In other words, it computes the equation Y = X * W^T + b, where X has dimensions (M x k), W has dimensions (n x k), and b is (1 x n). The output Y will be of dimension (M x n), where M is the batch size. For the blobs X and W, we can use the GaussianFill operator to create random data, and for the bias values b, we can use the ConstantFill operator.
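A minimal sketch of running the FC operator directly through the Caffe2 workspace; the shapes (a 2 x 3 input and 5 x 3 weights) are chosen purely for illustration:

from caffe2.python import core, workspace

# Fill X (2 x 3) and W (5 x 3) with random Gaussian values, and b (5,) with a constant.
workspace.RunOperatorOnce(core.CreateOperator("GaussianFill", [], ["X"], shape=[2, 3], mean=0.0, std=1.0))
workspace.RunOperatorOnce(core.CreateOperator("GaussianFill", [], ["W"], shape=[5, 3], mean=0.0, std=1.0))
workspace.RunOperatorOnce(core.CreateOperator("ConstantFill", [], ["b"], shape=[5], value=1.0))

# Y = X * W^T + b, giving a (2 x 5) output.
workspace.RunOperatorOnce(core.CreateOperator("FC", ["X", "W", "b"], ["Y"]))
print(workspace.FetchBlob("Y"))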
Q.9 How to create a network in Caffe2?
Firstly, import the required packages:

from caffe2.python import core, workspace

After that, define the network by calling core.Net as follows:

net = core.Net("SingleLayerFC")

The name of the network is specified as SingleLayerFC. At this point, the network object called net has been created.
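A short continuation sketch that adds a fully connected operator to this network and runs it once; the blob names X, W, and b and their shapes are placeholders chosen for illustration:

import numpy as np
from caffe2.python import core, workspace

# Feed input, weight, and bias blobs into the workspace.
workspace.FeedBlob("X", np.random.randn(2, 3).astype(np.float32))
workspace.FeedBlob("W", np.random.randn(5, 3).astype(np.float32))
workspace.FeedBlob("b", np.ones(5, dtype=np.float32))

net = core.Net("SingleLayerFC")
net.FC(["X", "W", "b"], "Y")     # add a fully connected operator to the network
workspace.RunNetOnce(net)        # execute the network once
print(workspace.FetchBlob("Y"))  # fetch the (2 x 5) result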
Q.10 Name the methods of Image Processing.
Image processing consists of two steps:
1. Image Resizing
2. Image Cropping
Q.11 Explain the process of Image Resizing.
In this step, we first write a function for resizing the image; here, we resize the image to 227x227. The function uses skimage.transform (from the scikit-image package) and can be declared as:

def resize(img, input_height, input_width):

Next, we obtain the aspect ratio of the image by dividing the width by the height:

    original_aspect = img.shape[1] / float(img.shape[0])

If the aspect ratio is greater than 1, the image is wide, that is, in landscape mode. The following code adjusts the dimensions accordingly and returns the resized image:

    if original_aspect > 1:
        new_height = int(original_aspect * input_height)
        return skimage.transform.resize(img, (input_width, new_height), mode='constant',
                                        anti_aliasing=True, anti_aliasing_sigma=None)

If the aspect ratio is less than 1, the image is in portrait mode, and the following code adjusts the width:

    if original_aspect < 1:
        new_width = int(input_width / original_aspect)
        return skimage.transform.resize(img, (new_width, input_height), mode='constant',
                                        anti_aliasing=True, anti_aliasing_sigma=None)

Lastly, if the aspect ratio equals 1, no height/width adjustment is made:

    if original_aspect == 1:
        return skimage.transform.resize(img, (input_width, input_height), mode='constant',
                                        anti_aliasing=True, anti_aliasing_sigma=None)
Q.12 Explain the process of Image Cropping.
Firstly, declare the crop_image function as follows:

def crop_image(img, cropx, cropy):

After that, extract the dimensions of the image:

    y, x, c = img.shape

Then, compute a new starting point for the crop using the following two lines of code:

    startx = x // 2 - (cropx // 2)
    starty = y // 2 - (cropy // 2)

Lastly, return the cropped image by slicing the array with the new dimensions:

    return img[starty:starty + cropy, startx:startx + cropx]
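A short usage sketch tying the two helpers above together; the file name "input.jpg" is a placeholder, and scikit-image is assumed to be installed:

import skimage
import skimage.io
import skimage.transform

# Load the image as a float array, resize it, then center-crop it to 227 x 227.
img = skimage.img_as_float(skimage.io.imread("input.jpg"))
img = resize(img, 227, 227)
img = crop_image(img, 227, 227)
print(img.shape)   # (227, 227, 3) for an RGB image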
Q.13 Define the term NumPy.
NumPy is a Python library that makes numerical calculations involving single- and multi-dimensional arrays and matrices easy, and it excels at performing such calculations. Many data science libraries like Pandas, Scikit-learn, SciPy, Matplotlib, etc. depend on NumPy, so it forms an integral part of today’s data science applications written in Python. NumPy provides:
1. A powerful N-dimensional array object called ndarray
2. Broadcasting functions
3. Tools for integrating C/C++ and Fortran code
4. Useful linear algebra, Fourier transform, and random number capabilities
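A quick sketch of two of these capabilities, broadcasting and the linear algebra/FFT submodules, on small made-up arrays:

import numpy as np

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Broadcasting: the 1-D row is stretched across both rows of the 2-D array.
print(a + np.array([10.0, 20.0, 30.0]))

# Linear algebra, Fourier transform, and random number capabilities.
m = np.random.rand(3, 3)
print(np.linalg.det(m))   # determinant of a random 3 x 3 matrix
print(np.fft.fft(a[0]))   # discrete Fourier transform of the first row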
Q.14 How to create a matrix in NumPy?
Creating a matrix using lists:

import numpy as np

# Create a 2D NumPy array using Python lists.
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)

Here, np.array is used to create a NumPy array from a list; NumPy arrays are of type ndarray. The output of the above program is:

[[1 2 3]
 [4 5 6]]

It represents a 2D matrix where the input to np.array() is a list of lists [[1, 2, 3], [4, 5, 6]]; each list in the parent list forms a row of the matrix.
Q.15 What is a Deep Neural Network?
A deep neural network is a type of machine learning system in which many layers of nodes are used to derive high-level functions from the input information. In effect, it converts the data into more abstract and composite representations.
Q.16 What is Convolutional Neural Network?
A Convolutional Neural Network (ConvNet/CNN) is a deep learning algorithm that can take in an input image, assign importance to various aspects or objects in the image, and differentiate one from the other. The pre-processing required in a ConvNet is much lower than in other classification algorithms: while in primitive methods filters are hand-engineered, with enough training ConvNets can learn these filters/characteristics themselves.
Q.17 Explain the process of training a CNN.
The process of training a CNN for classifying images consists of the following steps:
1. Data Preparation: we center-crop the images and resize them so that all images for training and testing are of the same size. This is usually done by running a small Python script on the image data.
2. Model Definition: we define a CNN architecture (a minimal sketch follows this list). The configuration is stored in a .pb (protobuf) file.
3. Solver Definition: we define the solver configuration file. The solver performs the model optimization.
4. Model Training: we use the built-in Caffe utility to train the model. The training may take a considerable amount of time and CPU usage. After training completes, Caffe stores the model in a file, which can later be used on test data and in final deployment for predictions.
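As an illustration of the model-definition step, here is a minimal LeNet-style sketch written with Caffe2's brew helpers, assuming 1-channel 28x28 input images fed as the blob "data":

from caffe2.python import brew, model_helper

# Sketch of a small CNN: two conv/pool stages followed by fully connected layers.
model = model_helper.ModelHelper(name="small_cnn")
conv1 = brew.conv(model, "data", "conv1", dim_in=1, dim_out=20, kernel=5)
pool1 = brew.max_pool(model, conv1, "pool1", kernel=2, stride=2)
conv2 = brew.conv(model, pool1, "conv2", dim_in=20, dim_out=50, kernel=5)
pool2 = brew.max_pool(model, conv2, "pool2", kernel=2, stride=2)
fc3 = brew.fc(model, pool2, "fc3", dim_in=50 * 4 * 4, dim_out=500)  # 4 x 4 spatial size remains after the two pools
relu3 = brew.relu(model, fc3, "relu3")
pred = brew.fc(model, relu3, "pred", dim_in=500, dim_out=10)
softmax = brew.softmax(model, pred, "softmax")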
Q.18 Define an activation function.
The activation function is the non-linear function that we apply to the output of a particular layer of neurons before it propagates as the input to the next layer.
Q.19 What do you understand by the term Sigmoid?
The sigmoid function is one of the non-linear activation functions used in deep learning. It takes a real-valued number as input and compresses its output to the range 0 to 1. There are many functions with the characteristic “S”-shaped curve known as sigmoid functions; the most commonly used is the logistic function.
Q.20 What is ReLU?
ReLU stands for Rectified Linear Unit. It is a non-linear activation function for deep learning that was first popularized in the context of convolutional neural networks (CNNs). If the input is positive, the function outputs the value itself; if the input is negative, the output is zero. Evaluating ReLU is computationally efficient, as it does not involve computing exp(x), so in practice it converges much faster than the logistic/tanh functions for the same performance. That is why ReLU has become a de-facto standard for large convolutional neural network architectures such as Inception, ResNet, MobileNet, VGGNet, etc.
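Minimal NumPy sketches of the two activation functions discussed above, evaluated on a small made-up input vector:

import numpy as np

def sigmoid(x):
    # Logistic function: squashes any real value into the range 0 to 1.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Rectified Linear Unit: passes positive inputs through, zeroes out negatives.
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x))   # all values lie between 0 and 1
print(relu(x))      # negative inputs become 0, positive inputs are unchanged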
Q.21 What are Recurrent neural networks?
Recurrent neural networks are one of the staples of deep learning, enabling neural networks to work with sequences of data like text, audio, and video. They can be used for boiling a sequence down into a high-level understanding, annotating sequences, and even generating new sequences from scratch. The basic RNN design struggles with longer sequences, but a special variant, “long short-term memory” (LSTM) networks, can handle these. Such models have been found to be very powerful, achieving remarkable results in many tasks including translation, voice recognition, and image captioning. As a result, recurrent neural networks have become very widespread in the last few years.
Q.22 What is word embedding in NLP?
In natural language processing (NLP), word embedding is a term used for the representation of words for text analysis, typically in the form of a real-valued vector that encodes the meaning of the word such that words that are closer in the vector space are expected to be similar in meaning. Embeddings can be obtained using a set of language modeling and feature learning techniques where words or phrases from the vocabulary are mapped to vectors of real numbers. Conceptually, this involves a mathematical embedding from a space with many dimensions per word into a continuous vector space with a much lower dimension.
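Closeness in an embedding space is usually measured with cosine similarity; here is a tiny sketch using made-up 3-dimensional vectors (real embeddings typically have hundreds of dimensions):

import numpy as np

def cosine_similarity(u, v):
    # Cosine of the angle between two vectors: 1 means identical direction.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

king = np.array([0.8, 0.1, 0.6])    # hypothetical word vectors
queen = np.array([0.7, 0.2, 0.6])
print(cosine_similarity(king, queen))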
Q.23 Explain the different types of Recurrent neural networks (RNN).

The core reason that recurrent nets are exciting is that they allow us to operate over sequences of vectors: sequences in the input, the output, or, in the most general case, both. Some example types include:

1. One-to-one: also known as a plain/vanilla neural network. It maps a fixed-size input to a fixed-size output, independent of any previous information/output. Example: image classification.

2. One-to-many: takes a fixed-size input and gives a sequence of data as output. Example: image captioning, which takes an image as input and outputs a sentence of words.

3. Many-to-one: takes a sequence of information as input and outputs a fixed-size result. Example: sentiment analysis, where a given sentence is classified as expressing positive or negative sentiment.

4. Many-to-many: takes a sequence of information as input, processes it recurrently, and outputs a sequence of data. Example: machine translation, where an RNN reads a sentence in English and then outputs a sentence in French.

Q.24 What is Long Short Term Memory (LSTM)? Explain its process.

LSTMs have remembering information for long periods of time as their default behavior. An LSTM follows a three-step process (a single-timestep sketch in code follows the list):

1. Forget Gate: this gate decides which information is to be omitted from the cell at that particular timestep. The decision is made by a sigmoid function, which looks at the previous hidden state (h(t-1)) and the current input (x(t)) and outputs a number between 0 (omit this) and 1 (keep this) for each number in the cell state C(t-1).

2. Update/Input Gate: decides how much new information is added to the current cell state. A sigmoid function decides which values to let through (0 to 1), and a tanh function gives weightage to the values that are passed, deciding their level of importance on a scale from -1 to 1.

3. Output Gate: decides which part of the current cell state makes it to the output. A sigmoid function decides which values to let through (0 to 1), and a tanh function gives weightage to the cell-state values, deciding their level of importance on a scale from -1 to 1; this is then multiplied by the output of the sigmoid.
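Putting the three gates together, here is a plain-NumPy sketch of a single LSTM timestep; the weight matrices W_f, W_i, W_c, W_o and the bias vectors are hypothetical parameters introduced only for illustration:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_c, W_o, b_f, b_i, b_c, b_o):
    z = np.concatenate([h_prev, x_t])   # previous hidden state joined with the current input
    f = sigmoid(W_f @ z + b_f)          # forget gate: what to drop from the cell state
    i = sigmoid(W_i @ z + b_i)          # update/input gate: what new information to admit
    c_tilde = np.tanh(W_c @ z + b_c)    # candidate values, weighted between -1 and 1
    c = f * c_prev + i * c_tilde        # new cell state
    o = sigmoid(W_o @ z + b_o)          # output gate: which parts of the cell to expose
    h = o * np.tanh(c)                  # new hidden state
    return h, c

# Toy usage with random parameters: hidden size 4, input size 3.
H, D = 4, 3
rng = np.random.default_rng(0)
weights = [rng.standard_normal((H, H + D)) for _ in range(4)]
biases = [np.zeros(H) for _ in range(4)]
h, c = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), *weights, *biases)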

Q.25 What is sequence learning?
Sequence learning is an integral part of conscious and nonconscious learning as well as activities. Sequences of information or sequences of actions are used in various everyday tasks: from sequencing sounds in speech, to sequencing movements in typing or playing instruments, to sequencing actions in driving an automobile. Further, it can be used to study skill acquisition and in studies of various groups ranging from neuropsychological patients to infants. Sequence learning is also referred to as sequential behavior, behavior sequencing, and serial order in behavior.
Q.26 Explain the term regularization.
Regularization is a method that makes slight modifications to the learning algorithm such that the model generalizes better, which in turn improves the model’s performance on unseen data. Some common regularization techniques are:
1. L2 and L1 Regularization
2. Dropout
3. Early Stopping
4. Data Augmentation
Q.27 Explain the L2 and L1 Regularization techniques.
L2 and L1 are the most common types of regularization. Regularization works on the premise that smaller weights lead to simpler models, which helps avoid overfitting. To obtain a smaller weight matrix, these techniques add a ‘regularization term’ to the loss to form the cost function:
Cost function = Loss + Regularization term
The difference between the L1 and L2 techniques lies in the nature of this regularization term. In general, adding this term causes the values of the weight matrices to shrink, leading to simpler models.
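A framework-agnostic sketch of the two penalty terms, using a hypothetical weight vector w and regularization strength lam; the base loss value is a placeholder:

import numpy as np

def l1_penalty(w, lam):
    return lam * np.sum(np.abs(w))   # L1: sum of absolute weights

def l2_penalty(w, lam):
    return lam * np.sum(w ** 2)      # L2: sum of squared weights

w = np.array([0.5, -1.2, 0.0, 2.0])
base_loss = 0.37                     # placeholder data loss
cost_l1 = base_loss + l1_penalty(w, lam=0.01)
cost_l2 = base_loss + l2_penalty(w, lam=0.01)
print(cost_l1, cost_l2)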
Q.28 What do you understand about Dropout and early stopping techniques?
Dropout means that during training, randomly selected neurons are turned off or ‘dropped out’. They are temporarily prevented from influencing or activating downstream neurons in the forward pass, and no weight updates are applied to them on the backward pass. Early Stopping, on the other hand, is a kind of cross-validation strategy where one part of the training set is used as a validation set, and the performance of the model is gauged against this set. While fitting a neural network on the training data, the model is evaluated on this unseen validation set after each iteration; if the performance on the validation set is decreasing or remaining the same for a certain number of iterations, the training process is stopped.
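A minimal sketch of dropout applied to an activation vector, using the common 'inverted dropout' scaling and an assumed keep probability of 0.8; at test time the layer would be used unchanged:

import numpy as np

def dropout(activations, keep_prob=0.8):
    # Randomly zero out units; scale the survivors so the expected value is preserved.
    mask = np.random.rand(*activations.shape) < keep_prob
    return activations * mask / keep_prob

a = np.array([0.3, 1.2, -0.7, 0.5])
print(dropout(a))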
Q.29 What is Data Augmentation?
A simple way to reduce overfitting is to train on more data, and data augmentation helps do exactly that. Data augmentation is a regularization technique generally used when the dataset consists of images. It artificially creates additional data from the existing training data by making minor changes such as rotating, flipping, cropping, or blurring a few pixels of an image, thereby generating more and more data. Through this regularization technique the model variance is reduced, which in turn decreases the generalization error.
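A small augmentation sketch using the same scikit-image stack as the pre-processing answers above; img is a placeholder array standing in for a real training image:

import numpy as np
import skimage.transform

img = np.random.rand(64, 64, 3)   # placeholder H x W x C image

def augment(img):
    flipped = np.fliplr(img)                            # horizontal flip
    rotated = skimage.transform.rotate(img, angle=15)   # small rotation
    cropped = img[10:-10, 10:-10]                       # light cropping
    return [flipped, rotated, cropped]

print(len(augment(img)))   # three extra variants from one image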
Q.30 What do you know about Model Zoo?
Model Zoo can be considered a machine learning model deployment platform with a focus on ease of use. A model can be deployed to an HTTP endpoint with a single line of code:

from modelzoo.tensorflow import deploy, predict

# Train or load your TensorFlow model here.
model = train_model()

# Deploy with one function.
model_name = deploy(model)

# Make predictions from Python.
predictions = predict(model_name, image="test.jpg")
Q.31 What are the features of the Model Zoo?
Model Zoo offers the following features:
1. Single-function deployment, so there is no need to write extra code or learn new technologies.
2. Real-time monitoring of model features and predictions.
3. Autoscaling: down to zero during periods of low activity to save costs, and up to accommodate bursts of demand.
4. Auto-generated documentation of the model inputs and outputs.
5. An in-built web interface for testing and sharing models.
6. A Python client library for making predictions.
Q.32 Define supervised learning.
Supervised learning or supervised machine learning refers to a subcategory of machine learning and artificial intelligence. It is defined by its use of labeled datasets for training algorithms for classifying data or predicting outcomes accurately. As input data is fed into the model, it adjusts its weights until the model has been fitted appropriately, which occurs as part of the cross-validation process. Supervised learning helps organizations solve a variety of real-world problems at scale, such as classifying spam in a separate folder from your inbox.
Q.33 What do you know about K-nearest neighbor?
K-nearest neighbors (the KNN algorithm) is a non-parametric algorithm that classifies data points based on their proximity and association to other available data. It assumes that similar data points can be found near each other. Accordingly, it calculates the distance between data points, typically using the Euclidean distance, and then assigns a category based on the most frequent category among the neighbors (or their average, for regression). KNN is typically used for recommendation engines and image recognition.
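A bare-bones KNN classification sketch in NumPy, with made-up toy data; k and the arrays are illustrative only:

import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    distances = np.linalg.norm(X_train - x_query, axis=1)   # Euclidean distance to every training point
    nearest = np.argsort(distances)[:k]                     # indices of the k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                        # majority vote among the neighbors

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [8.0, 8.0], [8.2, 7.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9])))  # -> 0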
Q.34 What is transfer learning (TL)?
Transfer learning (TL) can be defined as a research problem in machine learning (ML) that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize cars could be applied when trying to recognize trucks. This area of research bears some relation to the long history of psychological literature on the transfer of learning, although formal ties between the two fields are limited. Further, reusing or transferring information from previously learned tasks to new tasks has the potential to significantly improve the sample efficiency of a reinforcement learning agent.
Q.35 Explain deep reinforcement learning.
Deep reinforcement learning (deep RL) can be defined as a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, enabling agents to make decisions from unstructured input data without manual engineering of the state space. Moreover, the Deep RL algorithms are able to take in very large inputs and decide what actions to perform to optimize an objective. Deep reinforcement learning has been used for a diverse set of applications including but not limited to robotics, video games, natural language processing, computer vision, education, transportation, finance, and healthcare.
Q.36 Explain the term, Naive Bayes.
Naive Bayes is a classification approach that adopts the principle of class conditional independence from Bayes’ theorem. This means that the presence of one feature does not impact the presence of another in the probability of a given outcome, and each predictor has an equal effect on the result. There are three types of Naïve Bayes classifiers:
1. Multinomial Naïve Bayes
2. Bernoulli Naïve Bayes
3. Gaussian Naïve Bayes
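A small scikit-learn sketch of the Gaussian variant mentioned above, trained on made-up toy data purely for illustration (scikit-learn is assumed to be installed):

import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]])
y = np.array([0, 0, 1, 1])

clf = GaussianNB()
clf.fit(X, y)                     # fit class-conditional Gaussians per feature
print(clf.predict([[1.2, 1.9]]))  # -> [0]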