Deep Learning with Caffe2

Caffe2 is a deep learning framework built with expression, speed, and modularity in mind. Below, we have listed some interview questions that can help you prepare for a role in data science.

Q.1 Compare Caffe with Caffe2.
The original Caffe framework is considered useful for large-scale product use cases, especially given its unparalleled performance and well-tested C++ codebase. However, Caffe has some design choices inherited from its original use case: conventional CNN applications. As new computation patterns have emerged, especially distributed computation, mobile deployment, reduced-precision computation, and more non-vision use cases, its design has shown some limitations. The following points describe how Caffe2 improves on Caffe 1.0:
1. First-class support for large-scale distributed training
2. Mobile deployment
3. New hardware support (in addition to CPU and CUDA)
4. Flexibility for future directions such as quantized computation
5. Stress tested by the vast scale of Facebook applications
Q.2 What are the new features in Caffe2?
The basic unit of computation in Caffe2 is the Operator, which can be thought of as a more flexible version of a layer from Caffe. Caffe2 comes with over 400 different operators and provides guidance for the community to create and contribute to this growing resource.
Q.3 Differentiate between Caffe2 and PyTorch.
Some of the points of difference are:
1. Caffe2 was built to excel at mobile and large-scale deployments. Its multi-GPU support is new, bringing Torch and Caffe2 together with the same level of GPU support.
2. Caffe2 is built to make full use of both multiple GPUs on a single host and multiple hosts with GPUs. PyTorch, on the other hand, is great for research, experimentation, and trying out exotic neural networks.
3. Caffe2 is headed towards supporting more industrial-strength applications with a heavy focus on mobile. This does not mean that PyTorch cannot do mobile or cannot scale, or that you cannot use Caffe2 with some awesome new paradigm of neural network.
Q.4 Why should we use Caffe?
Following are the popular features of Caffe:
1. Expressive architecture encourages application and innovation. Models and optimization are defined by configuration without hard-coding. Switch between CPU and GPU by setting a single flag to train on a GPU machine, then deploy to commodity clusters or mobile devices.
2. Extensible code fosters active development. In Caffe’s first year, it has been forked by over 1,000 developers and had many significant changes contributed back. Thanks to these contributors the framework tracks the state-of-the-art in both code and models.
3. Speed makes Caffe perfect for research experiments and industry deployment. Caffe can process over 60M images per day with a single NVIDIA K40 GPU*. That’s 1 ms/image for inference and 4 ms/image for learning, and more recent library versions and hardware are faster still. We believe that Caffe is among the fastest convnet implementations available.
4. Community: Caffe already powers academic research projects, startup prototypes, and even large-scale industrial applications in vision, speech, and multimedia. Join our community of brewers on the caffe-users group and Github.
Q.5 Name the heterogeneous computing architectures currently supported by Caffe.
The heterogeneous computing architectures currently supported by Caffe are:
1. GPUs
2. FPGAs
3. Dedicated CNN processors
Q.6 What is Caffe?
CAFFE, short for Convolutional Architecture for Fast Feature Embedding, is a deep learning framework originally developed at the University of California, Berkeley. It is open source under a BSD license and is written in C++ with a Python interface.
Q.7 Explain the difference between Machine Learning and Deep Learning.
In a machine learning algorithm, the selection of features in the dataset plays an extremely important role in achieving the desired prediction accuracy. In traditional machine learning techniques, feature selection is done mostly by human inspection, judgment, and deep domain knowledge; such feature engineering is generally time-consuming and requires good expertise in the domain. In deep learning algorithms, by contrast, feature engineering is done automatically. To implement this automatic feature extraction, deep learning algorithms typically require a huge amount of data, so if you have only thousands or tens of thousands of data points, a deep learning technique may fail to give you satisfactory results. With larger data, deep learning algorithms produce better results than traditional ML algorithms, with the added advantage of little or no feature engineering.
Q.8 What are Caffe2 Operators?
In Caffe2, the Operator is the basic unit of computation, and Caffe2 provides an exhaustive list of operators. One example is the operator called FC, which computes the result of passing an input vector X through a fully connected network with a two-dimensional weight matrix W and a one-dimensional bias vector b. In other words, it computes the equation Y = X * W^T + b, where X has dimensions (M x k), W has dimensions (n x k), and b is (1 x n). The output Y will be of dimension (M x n), where M is the batch size. For the blobs X and W, we can use the GaussianFill operator to create random data, and for the bias values b, we can use the ConstantFill operator.
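A minimal sketch of running the FC operator directly through the Caffe2 workspace; the shapes (a 2 x 3 input and 5 x 3 weights) are chosen purely for illustration:

from caffe2.python import core, workspace

# Fill X (2 x 3) and W (5 x 3) with random Gaussian values, and b (5,) with a constant.
workspace.RunOperatorOnce(core.CreateOperator("GaussianFill", [], ["X"], shape=[2, 3], mean=0.0, std=1.0))
workspace.RunOperatorOnce(core.CreateOperator("GaussianFill", [], ["W"], shape=[5, 3], mean=0.0, std=1.0))
workspace.RunOperatorOnce(core.CreateOperator("ConstantFill", [], ["b"], shape=[5], value=1.0))

# Y = X * W^T + b, giving a (2 x 5) output.
workspace.RunOperatorOnce(core.CreateOperator("FC", ["X", "W", "b"], ["Y"]))
print(workspace.FetchBlob("Y"))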
Q.9 How to create a network in Caffe2?
Firstly, import the required packages:

from caffe2.python import core, workspace

After that, define the network by calling core.Net as follows:

net = core.Net("SingleLayerFC")

The name of the network is specified as SingleLayerFC. At this point, the network object called net has been created.
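A short continuation sketch that adds a fully connected operator to this network and runs it once; the blob names X, W, and b and their shapes are placeholders chosen for illustration:

import numpy as np
from caffe2.python import core, workspace

# Feed input, weight, and bias blobs into the workspace.
workspace.FeedBlob("X", np.random.randn(2, 3).astype(np.float32))
workspace.FeedBlob("W", np.random.randn(5, 3).astype(np.float32))
workspace.FeedBlob("b", np.ones(5, dtype=np.float32))

net = core.Net("SingleLayerFC")
net.FC(["X", "W", "b"], "Y")     # add a fully connected operator to the network
workspace.RunNetOnce(net)        # execute the network once
print(workspace.FetchBlob("Y"))  # fetch the (2 x 5) result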
Q.10 Name the methods of Image Processing.
Image processing consists of two steps:
1. Image Resizing
2. Image Cropping
Q.11 Explain the process of Image Resizing.
In this step, we first write a function for resizing the image; here, we resize the image to 227x227. The function uses skimage.transform (from the scikit-image package) and can be declared as:

def resize(img, input_height, input_width):

Next, we obtain the aspect ratio of the image by dividing the width by the height:

    original_aspect = img.shape[1] / float(img.shape[0])

If the aspect ratio is greater than 1, the image is wide, that is, in landscape mode. The following code adjusts the dimensions accordingly and returns the resized image:

    if original_aspect > 1:
        new_height = int(original_aspect * input_height)
        return skimage.transform.resize(img, (input_width, new_height), mode='constant',
                                        anti_aliasing=True, anti_aliasing_sigma=None)

If the aspect ratio is less than 1, the image is in portrait mode, and the following code adjusts the width:

    if original_aspect < 1:
        new_width = int(input_width / original_aspect)
        return skimage.transform.resize(img, (new_width, input_height), mode='constant',
                                        anti_aliasing=True, anti_aliasing_sigma=None)

Lastly, if the aspect ratio equals 1, no height/width adjustment is made:

    if original_aspect == 1:
        return skimage.transform.resize(img, (input_width, input_height), mode='constant',
                                        anti_aliasing=True, anti_aliasing_sigma=None)
Q.12 Explain the process of Image Cropping.
Firstly, declare the crop_image function as follows:

def crop_image(img, cropx, cropy):

After that, extract the dimensions of the image:

    y, x, c = img.shape

Then, compute a new starting point for the crop using the following two lines of code:

    startx = x // 2 - (cropx // 2)
    starty = y // 2 - (cropy // 2)

Lastly, return the cropped image by slicing the array with the new dimensions:

    return img[starty:starty + cropy, startx:startx + cropx]
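A short usage sketch tying the two helpers above together; the file name "input.jpg" is a placeholder, and scikit-image is assumed to be installed:

import skimage
import skimage.io
import skimage.transform

# Load the image as a float array, resize it, then center-crop it to 227 x 227.
img = skimage.img_as_float(skimage.io.imread("input.jpg"))
img = resize(img, 227, 227)
img = crop_image(img, 227, 227)
print(img.shape)   # (227, 227, 3) for an RGB image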
Q.13 Define the term NumPy.
NumPy is a Python library that makes numerical calculations involving single- and multi-dimensional arrays and matrices easy, and it excels at performing such calculations. Many data science libraries like Pandas, Scikit-learn, SciPy, Matplotlib, etc. depend on NumPy, so it forms an integral part of today’s data science applications written in Python. NumPy provides:
1. A powerful N-dimensional array object called ndarray
2. Broadcasting functions
3. Tools for integrating C/C++ and Fortran code
4. Useful linear algebra, Fourier transform, and random number capabilities
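A quick sketch of two of these capabilities, broadcasting and the linear algebra/FFT submodules, on small made-up arrays:

import numpy as np

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Broadcasting: the 1-D row is stretched across both rows of the 2-D array.
print(a + np.array([10.0, 20.0, 30.0]))

# Linear algebra, Fourier transform, and random number capabilities.
m = np.random.rand(3, 3)
print(np.linalg.det(m))   # determinant of a random 3 x 3 matrix
print(np.fft.fft(a[0]))   # discrete Fourier transform of the first row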
Q.14 How to create a matrix in NumPy?
Creating a matrix using lists:

import numpy as np

# Create a 2D NumPy array using Python lists.
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)

Here, np.array is used to create a NumPy array from a list; NumPy arrays are of type ndarray. The output of the above program is:

[[1 2 3]
 [4 5 6]]

It represents a 2D matrix where the input to np.array() is a list of lists [[1, 2, 3], [4, 5, 6]]; each list in the parent list forms a row of the matrix.
Q.15 What is a Deep Neural Network?
A deep neural network is a type of machine learning system in which many layers of nodes are used to derive high-level functions from the input information. In effect, it converts the data into more abstract and composite representations.
Q.16 What is Convolutional Neural Network?
A Convolutional Neural Network (ConvNet/CNN) is a deep learning algorithm that can take in an input image, assign importance to various aspects or objects in the image, and differentiate one from the other. The pre-processing required in a ConvNet is much lower than in other classification algorithms: while in primitive methods filters are hand-engineered, with enough training ConvNets can learn these filters/characteristics themselves.
Q.17 Explain the process of training a CNN.
The process of training a CNN for classifying images consists of the following steps:
1. Data Preparation: we center-crop the images and resize them so that all images for training and testing are of the same size. This is usually done by running a small Python script on the image data.
2. Model Definition: we define a CNN architecture (a minimal sketch follows this list). The configuration is stored in a .pb (protobuf) file.
3. Solver Definition: we define the solver configuration file. The solver performs the model optimization.
4. Model Training: we use the built-in Caffe utility to train the model. The training may take a considerable amount of time and CPU usage. After training completes, Caffe stores the model in a file, which can later be used on test data and in final deployment for predictions.
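As an illustration of the model-definition step, here is a minimal LeNet-style sketch written with Caffe2's brew helpers, assuming 1-channel 28x28 input images fed as the blob "data":

from caffe2.python import brew, model_helper

# Sketch of a small CNN: two conv/pool stages followed by fully connected layers.
model = model_helper.ModelHelper(name="small_cnn")
conv1 = brew.conv(model, "data", "conv1", dim_in=1, dim_out=20, kernel=5)
pool1 = brew.max_pool(model, conv1, "pool1", kernel=2, stride=2)
conv2 = brew.conv(model, pool1, "conv2", dim_in=20, dim_out=50, kernel=5)
pool2 = brew.max_pool(model, conv2, "pool2", kernel=2, stride=2)
fc3 = brew.fc(model, pool2, "fc3", dim_in=50 * 4 * 4, dim_out=500)  # 4 x 4 spatial size remains after the two pools
relu3 = brew.relu(model, fc3, "relu3")
pred = brew.fc(model, relu3, "pred", dim_in=500, dim_out=10)
softmax = brew.softmax(model, pred, "softmax")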
Q.18 Define an activation function.
The activation function is the non-linear function that we apply to the output of a particular layer of neurons before it propagates as the input to the next layer.
Q.19 What do you understand by the term Sigmoid?
The sigmoid function is one of the non-linear activation functions used in deep learning. It takes a real-valued number as input and compresses its output to the range 0 to 1. There are many functions with the characteristic “S”-shaped curve known as sigmoid functions; the most commonly used is the logistic function.
Q.20 What is ReLU?
ReLU stands for Rectified Linear Unit. It is a non-linear activation function for deep learning that was first popularized in the context of convolutional neural networks (CNNs). If the input is positive, the function outputs the value itself; if the input is negative, the output is zero. Evaluating ReLU is computationally efficient, as it does not involve computing exp(x), so in practice it converges much faster than the logistic/tanh functions for the same performance. That is why ReLU has become a de-facto standard for large convolutional neural network architectures such as Inception, ResNet, MobileNet, VGGNet, etc.
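Minimal NumPy sketches of the two activation functions discussed above, evaluated on a small made-up input vector:

import numpy as np

def sigmoid(x):
    # Logistic function: squashes any real value into the range 0 to 1.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Rectified Linear Unit: passes positive inputs through, zeroes out negatives.
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x))   # all values lie between 0 and 1
print(relu(x))      # negative inputs become 0, positive inputs are unchanged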
Q.21 What are Recurrent neural networks?
Recurrent neural networks are one of the staples of deep learning, enabling neural networks to work with sequences of data like text, audio, and video. They can be used for boiling a sequence down into a high-level understanding, annotating sequences, and even generating new sequences from scratch. The basic RNN design struggles with longer sequences, but a special variant, “long short-term memory” (LSTM) networks, can handle these. Such models have been found to be very powerful, achieving remarkable results in many tasks including translation, voice recognition, and image captioning. As a result, recurrent neural networks have become very widespread in the last few years.
Q.22 What is word embedding in NLP?
In natural language processing (NLP), word embedding is a term used for the representation of words for text analysis, typically in the form of a real-valued vector that encodes the meaning of the word such that words that are closer in the vector space are expected to be similar in meaning. Embeddings can be obtained using a set of language modeling and feature learning techniques where words or phrases from the vocabulary are mapped to vectors of real numbers. Conceptually, this involves a mathematical embedding from a space with many dimensions per word into a continuous vector space with a much lower dimension.
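Closeness in an embedding space is usually measured with cosine similarity; here is a tiny sketch using made-up 3-dimensional vectors (real embeddings typically have hundreds of dimensions):

import numpy as np

def cosine_similarity(u, v):
    # Cosine of the angle between two vectors: 1 means identical direction.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

king = np.array([0.8, 0.1, 0.6])    # hypothetical word vectors
queen = np.array([0.7, 0.2, 0.6])
print(cosine_similarity(king, queen))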
Q.23 Explain the different types of Recurrent neural networks (RNN).

The core reason that recurrent nets are exciting is that they allow us to operate over sequences of vectors: sequences in the input, the output, or, in the most general case, both. Some example types include:

1. One-to-one: also known as a plain/vanilla neural network. It maps a fixed-size input to a fixed-size output, independent of any previous information/output. Example: image classification.

2. One-to-many: takes a fixed-size input and gives a sequence of data as output. Example: image captioning, which takes an image as input and outputs a sentence of words.

3. Many-to-one: takes a sequence of information as input and outputs a fixed-size result. Example: sentiment analysis, where a given sentence is classified as expressing positive or negative sentiment.

4. Many-to-many: takes a sequence of information as input, processes it recurrently, and outputs a sequence of data. Example: machine translation, where an RNN reads a sentence in English and then outputs a sentence in French.

Q.24 What is Long Short Term Memory (LSTM)? Explain its process.

LSTMs have remembering information for long periods of time as their default behavior. An LSTM follows a three-step process (a single-timestep sketch in code follows the list):

1. Forget Gate: this gate decides which information is to be omitted from the cell at that particular timestep. The decision is made by a sigmoid function, which looks at the previous hidden state (h(t-1)) and the current input (x(t)) and outputs a number between 0 (omit this) and 1 (keep this) for each number in the cell state C(t-1).

2. Update/Input Gate: decides how much new information is added to the current cell state. A sigmoid function decides which values to let through (0 to 1), and a tanh function gives weightage to the values that are passed, deciding their level of importance on a scale from -1 to 1.

3. Output Gate: decides which part of the current cell state makes it to the output. A sigmoid function decides which values to let through (0 to 1), and a tanh function gives weightage to the cell-state values, deciding their level of importance on a scale from -1 to 1; this is then multiplied by the output of the sigmoid.
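Putting the three gates together, here is a plain-NumPy sketch of a single LSTM timestep; the weight matrices W_f, W_i, W_c, W_o and the bias vectors are hypothetical parameters introduced only for illustration:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_c, W_o, b_f, b_i, b_c, b_o):
    z = np.concatenate([h_prev, x_t])   # previous hidden state joined with the current input
    f = sigmoid(W_f @ z + b_f)          # forget gate: what to drop from the cell state
    i = sigmoid(W_i @ z + b_i)          # update/input gate: what new information to admit
    c_tilde = np.tanh(W_c @ z + b_c)    # candidate values, weighted between -1 and 1
    c = f * c_prev + i * c_tilde        # new cell state
    o = sigmoid(W_o @ z + b_o)          # output gate: which parts of the cell to expose
    h = o * np.tanh(c)                  # new hidden state
    return h, c

# Toy usage with random parameters: hidden size 4, input size 3.
H, D = 4, 3
rng = np.random.default_rng(0)
weights = [rng.standard_normal((H, H + D)) for _ in range(4)]
biases = [np.zeros(H) for _ in range(4)]
h, c = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), *weights, *biases)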

Q.25 What is sequence learning?
Sequence learning is an integral part of conscious and nonconscious learning as well as activities. Sequences of information or sequences of actions are used in various everyday tasks: from sequencing sounds in speech, to sequencing movements in typing or playing instruments, to sequencing actions in driving an automobile. Further, it can be used to study skill acquisition and in studies of various groups ranging from neuropsychological patients to infants. Sequence learning is also referred to as sequential behavior, behavior sequencing, and serial order in behavior.
Q.26 Explain the term regularization.
Regularization is a method that makes slight modifications to the learning algorithm such that the model generalizes better, which in turn improves the model’s performance on unseen data. Some common regularization techniques are:
1. L2 and L1 Regularization
2. Dropout
3. Early Stopping
4. Data Augmentation
Q.27 Explain the L2 and L1 Regularization techniques.
L2 and L1 are the most common types of regularization. Regularization works on the premise that smaller weights lead to simpler models, which helps avoid overfitting. To obtain a smaller weight matrix, these techniques add a ‘regularization term’ to the loss to form the cost function:
Cost function = Loss + Regularization term
The difference between the L1 and L2 techniques lies in the nature of this regularization term. In general, adding this term causes the values of the weight matrices to shrink, leading to simpler models.
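A framework-agnostic sketch of the two penalty terms, using a hypothetical weight vector w and regularization strength lam; the base loss value is a placeholder:

import numpy as np

def l1_penalty(w, lam):
    return lam * np.sum(np.abs(w))   # L1: sum of absolute weights

def l2_penalty(w, lam):
    return lam * np.sum(w ** 2)      # L2: sum of squared weights

w = np.array([0.5, -1.2, 0.0, 2.0])
base_loss = 0.37                     # placeholder data loss
cost_l1 = base_loss + l1_penalty(w, lam=0.01)
cost_l2 = base_loss + l2_penalty(w, lam=0.01)
print(cost_l1, cost_l2)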
Q.28 What do you understand about Dropout and early stopping techniques?
Dropout means that during training, randomly selected neurons are turned off or ‘dropped out’. They are temporarily prevented from influencing or activating downstream neurons in the forward pass, and no weight updates are applied to them on the backward pass. Early Stopping, on the other hand, is a kind of cross-validation strategy where one part of the training set is used as a validation set, and the performance of the model is gauged against this set. While fitting a neural network on the training data, the model is evaluated on this unseen validation set after each iteration; if the performance on the validation set is decreasing or remaining the same for a certain number of iterations, the training process is stopped.
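A minimal sketch of dropout applied to an activation vector, using the common 'inverted dropout' scaling and an assumed keep probability of 0.8; at test time the layer would be used unchanged:

import numpy as np

def dropout(activations, keep_prob=0.8):
    # Randomly zero out units; scale the survivors so the expected value is preserved.
    mask = np.random.rand(*activations.shape) < keep_prob
    return activations * mask / keep_prob

a = np.array([0.3, 1.2, -0.7, 0.5])
print(dropout(a))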
Q.29 What is Data Augmentation?
A simple way to reduce overfitting is to train on more data, and data augmentation helps do exactly that. Data augmentation is a regularization technique generally used when the dataset consists of images. It artificially creates additional data from the existing training data by making minor changes such as rotating, flipping, cropping, or blurring a few pixels of an image, thereby generating more and more data. Through this regularization technique the model variance is reduced, which in turn decreases the generalization error.
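A small augmentation sketch using the same scikit-image stack as the pre-processing answers above; img is a placeholder array standing in for a real training image:

import numpy as np
import skimage.transform

img = np.random.rand(64, 64, 3)   # placeholder H x W x C image

def augment(img):
    flipped = np.fliplr(img)                            # horizontal flip
    rotated = skimage.transform.rotate(img, angle=15)   # small rotation
    cropped = img[10:-10, 10:-10]                       # light cropping
    return [flipped, rotated, cropped]

print(len(augment(img)))   # three extra variants from one image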
Q.30 What do you know about Model Zoo?
Model Zoo can be considered a machine learning model deployment platform with a focus on ease of use. A model can be deployed to an HTTP endpoint with a single line of code:

from modelzoo.tensorflow import deploy, predict

# Train or load your TensorFlow model here.
model = train_model()

# Deploy with one function.
model_name = deploy(model)

# Make predictions from Python.
predictions = predict(model_name, image="test.jpg")
Q.31 What are the features of the Model Zoo?
Model Zoo offers the following features:
1. Single-function deployment, so there is no need to write extra code or learn new technologies.
2. Real-time monitoring of model features and predictions.
3. Autoscaling: down to zero during periods of low activity to save costs, and up to accommodate bursts of demand.
4. Auto-generated documentation of the model inputs and outputs.
5. An in-built web interface for testing and sharing models.
6. A Python client library for making predictions.
Q.32 Define supervised learning.
Supervised learning or supervised machine learning refers to a subcategory of machine learning and artificial intelligence. It is defined by its use of labeled datasets for training algorithms for classifying data or predicting outcomes accurately. As input data is fed into the model, it adjusts its weights until the model has been fitted appropriately, which occurs as part of the cross-validation process. Supervised learning helps organizations solve a variety of real-world problems at scale, such as classifying spam in a separate folder from your inbox.
Q.33 What do you know about K-nearest neighbor?
K-nearest neighbors (the KNN algorithm) is a non-parametric algorithm that classifies data points based on their proximity and association to other available data. It assumes that similar data points can be found near each other. Accordingly, it calculates the distance between data points, typically using the Euclidean distance, and then assigns a category based on the most frequent category among the neighbors (or their average, for regression). KNN is typically used for recommendation engines and image recognition.
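A bare-bones KNN classification sketch in NumPy, with made-up toy data; k and the arrays are illustrative only:

import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    distances = np.linalg.norm(X_train - x_query, axis=1)   # Euclidean distance to every training point
    nearest = np.argsort(distances)[:k]                     # indices of the k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                        # majority vote among the neighbors

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [8.0, 8.0], [8.2, 7.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9])))  # -> 0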
Q.34 What is transfer learning (TL)?
Transfer learning (TL) can be defined as a research problem in machine learning (ML) that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize cars could be applied when trying to recognize trucks. This area of research bears some relation to the long history of psychological literature on the transfer of learning, although formal ties between the two fields are limited. Further, reusing or transferring information from previously learned tasks to new tasks has the potential to significantly improve the sample efficiency of a reinforcement learning agent.
Q.35 Explain deep reinforcement learning.
Deep reinforcement learning (deep RL) can be defined as a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, enabling agents to make decisions from unstructured input data without manual engineering of the state space. Moreover, the Deep RL algorithms are able to take in very large inputs and decide what actions to perform to optimize an objective. Deep reinforcement learning has been used for a diverse set of applications including but not limited to robotics, video games, natural language processing, computer vision, education, transportation, finance, and healthcare.
Q.36 Explain the term, Naive Bayes.
Naive Bayes is a classification approach that adopts the principle of class conditional independence from Bayes’ theorem. This means that the presence of one feature does not impact the presence of another in the probability of a given outcome, and each predictor has an equal effect on the result. There are three types of Naïve Bayes classifiers:
1. Multinomial Naïve Bayes
2. Bernoulli Naïve Bayes
3. Gaussian Naïve Bayes
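A small scikit-learn sketch of the Gaussian variant mentioned above, trained on made-up toy data purely for illustration (scikit-learn is assumed to be installed):

import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]])
y = np.array([0, 0, 1, 1])

clf = GaussianNB()
clf.fit(X, y)                     # fit class-conditional Gaussians per feature
print(clf.predict([[1.2, 1.9]]))  # -> [0]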