1. What is the difference between AI, ML, and Deep Learning?
Answer:
- AI (Artificial Intelligence) is a broad field that aims to create machines that can mimic human intelligence.
- ML (Machine Learning) is a subset of AI that focuses on training algorithms to learn from data.
- Deep Learning is a subset of ML that uses neural networks with multiple layers to model complex patterns in data.
2. What are the types of Machine Learning?
Answer:
- Supervised Learning – Uses labeled data (e.g., regression, classification).
- Unsupervised Learning – Works with unlabeled data (e.g., clustering, anomaly detection).
- Reinforcement Learning – An agent learns by interacting with an environment through rewards and penalties.
3. What is Overfitting and Underfitting?
Answer:
- Overfitting occurs when a model learns the training data too closely, including its noise, and performs poorly on unseen data.
- Underfitting occurs when the model is too simple and fails to capture the patterns in the data.
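A quick way to see both in practice is to compare training and test scores: a large gap suggests overfitting, while low scores on both suggest underfitting. Below is a minimal sketch (assuming scikit-learn and a synthetic dataset made up for illustration) comparing a very shallow and an unrestricted decision tree:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic dataset (made-up example data)
X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for depth in [1, None]:  # depth=1 tends to underfit, depth=None can overfit
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    tree.fit(X_train, y_train)
    print(f"max_depth={depth}: train={tree.score(X_train, y_train):.2f}, "
          f"test={tree.score(X_test, y_test):.2f}")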
4. What is Regularization in Machine Learning?
Answer:
Regularization is a technique used to prevent overfitting by adding a penalty term to the loss function. Common types include:
- L1 (Lasso) Regularization – Shrinks some feature weights to exactly zero, effectively performing feature selection.
- L2 (Ridge) Regularization – Shrinks feature weights toward zero but does not eliminate them.
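For illustration, here is a minimal scikit-learn sketch (using the Ridge and Lasso estimators on a small made-up dataset) of how the two penalties are applied; with the L1 penalty, some coefficients may end up exactly zero:

import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Small made-up dataset: y depends mainly on the first feature
X = np.array([[1.0, 0.1], [2.0, 0.2], [3.0, 0.1], [4.0, 0.3], [5.0, 0.2]])
y = np.array([2.0, 4.1, 5.9, 8.2, 10.0])

ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: shrinks weights
lasso = Lasso(alpha=0.5).fit(X, y)   # L1 penalty: can drive weights to zero

print("Ridge coefficients:", ridge.coef_)
print("Lasso coefficients:", lasso.coef_)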
5. What is the difference between Batch Gradient Descent and Stochastic Gradient Descent?
Answer:
- Batch Gradient Descent (BGD): Uses the entire dataset to compute the gradient in each iteration, which is computationally expensive but gives stable updates.
- Stochastic Gradient Descent (SGD): Uses a single data point per iteration, leading to faster updates but higher variance in the gradient estimates.
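A minimal NumPy sketch of the difference for a one-parameter linear model (illustrative only; the data, learning rate, and epoch count are made up):

import numpy as np

# Toy data: y = 2x (made-up example)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 * X
w, lr = 0.0, 0.01

# Batch GD: one update per pass, gradient averaged over ALL samples
for _ in range(100):
    grad = np.mean(2 * (w * X - y) * X)
    w -= lr * grad
print("Batch GD weight:", w)

# SGD: one update per individual sample (noisier, more frequent updates)
w = 0.0
for _ in range(100):
    for xi, yi in zip(X, y):
        grad = 2 * (w * xi - yi) * xi
        w -= lr * grad
print("SGD weight:", w)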
6. What are Precision, Recall, and F1-score?
Answer:
- Precision = TP / (TP + FP) → Measures correctness among positive predictions.
- Recall = TP / (TP + FN) → Measures ability to find all positive cases.
- F1-score = 2 × (Precision × Recall) / (Precision + Recall) → Harmonic mean of precision and recall.
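These can be computed by hand from the formulas above or with scikit-learn; a short check on made-up labels that the two agree:

from sklearn.metrics import precision_score, recall_score, f1_score

# Made-up labels: 2 TP, 1 FP, 1 FN
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]

tp, fp, fn = 2, 1, 1
precision = tp / (tp + fp)                 # 2/3
recall = tp / (tp + fn)                    # 2/3
f1 = 2 * precision * recall / (precision + recall)

print("Manual: ", precision, recall, f1)
print("sklearn:", precision_score(y_true, y_pred),
      recall_score(y_true, y_pred), f1_score(y_true, y_pred))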
7. What is the difference between Bagging and Boosting?
Answer:
- Bagging (Bootstrap Aggregating): Trains multiple models independently and averages their results (e.g., Random Forest).
- Boosting: Trains models sequentially, where each model corrects the errors of the previous one (e.g., AdaBoost, Gradient Boosting).
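Both are available out of the box in scikit-learn; a minimal sketch on a synthetic dataset chosen purely for illustration:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

bagging = RandomForestClassifier(n_estimators=100, random_state=42)       # bagging of trees
boosting = GradientBoostingClassifier(n_estimators=100, random_state=42)  # sequential boosting

for name, clf in [("Random Forest", bagging), ("Gradient Boosting", boosting)]:
    clf.fit(X_train, y_train)
    print(name, "test accuracy:", clf.score(X_test, y_test))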
8. What is the Curse of Dimensionality?
Answer:
It refers to the problems that arise when a dataset has too many features (dimensions): the data becomes sparse in the high-dimensional space, distances between points lose contrast, and models need far more data to generalize well, as illustrated in the sketch below.
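A small NumPy sketch of this distance-concentration effect (sample sizes are arbitrary): as the number of dimensions grows, the spread of distances shrinks relative to their mean, so "nearest" neighbors stop being meaningfully nearer than anything else.

import numpy as np

rng = np.random.default_rng(42)

for dims in [2, 10, 100, 1000]:
    points = rng.random((200, dims))                          # 200 random points in the unit hypercube
    dists = np.linalg.norm(points - points[0], axis=1)[1:]    # distances from the first point
    # Relative contrast between nearest and farthest distances shrinks with dimensionality
    print(f"dims={dims:5d}  (max-min)/mean distance = {(dists.max() - dists.min()) / dists.mean():.3f}")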
9. What are some common Activation Functions in Neural Networks?
Answer:
- Sigmoid – Squashes values into (0, 1); commonly used for binary classification outputs.
- ReLU (Rectified Linear Unit) – Outputs max(0, x); helps avoid vanishing gradient issues.
- Tanh – Similar to sigmoid but ranges from -1 to 1.
- Softmax – Converts scores into a probability distribution; used in multi-class classification.
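A minimal NumPy sketch of these four functions (plain implementations for illustration; deep learning frameworks provide their own versions):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def relu(x):
    return np.maximum(0, x)

def tanh(x):
    return np.tanh(x)

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / e.sum()

x = np.array([-2.0, 0.0, 2.0])
print("sigmoid:", sigmoid(x))
print("relu:   ", relu(x))
print("tanh:   ", tanh(x))
print("softmax:", softmax(x))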
10. What is the difference between CNN and RNN?
Answer:
- CNN (Convolutional Neural Network) is used for spatial data like images.
- RNN (Recurrent Neural Network) is used for sequential data like text and time series.
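A minimal Keras sketch of the two layer types (layer sizes and input shapes are arbitrary, only to show where each fits):

from tensorflow import keras

# CNN: convolutions over spatial data (e.g., 28x28 grayscale images)
cnn = keras.Sequential([
    keras.layers.Conv2D(16, kernel_size=3, activation='relu', input_shape=(28, 28, 1)),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='softmax')
])

# RNN: recurrence over sequential data (e.g., sequences of 20 steps, 8 features each)
rnn = keras.Sequential([
    keras.layers.SimpleRNN(32, input_shape=(20, 8)),
    keras.layers.Dense(1, activation='sigmoid')
])

cnn.summary()
rnn.summary()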
11. Write Python code to implement a simple Linear Regression model using Scikit-Learn.
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import numpy as np

# Sample Data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 6, 8, 10])

# Split Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Model
model = LinearRegression()
model.fit(X_train, y_train)

# Predictions
predictions = model.predict(X_test)
print(predictions)
12. How would you implement a Decision Tree in Python?
from sklearn.tree import DecisionTreeClassifier

# Sample data
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
y = [0, 1, 1, 0]

# Train Model
clf = DecisionTreeClassifier()
clf.fit(X, y)

# Predict
print(clf.predict([[1.5, 1.5]]))
13. Write a Python function to compute the sigmoid function.
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

print(sigmoid(0))  # Output: 0.5
14. How can you implement K-Means clustering in Python?
from sklearn.cluster import KMeans
import numpy as np

# Sample Data
X = np.array([[1, 2], [3, 4], [5, 6], [8, 9]])

# Train Model
kmeans = KMeans(n_clusters=2, random_state=42)
kmeans.fit(X)

# Cluster Centers
print(kmeans.cluster_centers_)
15. Write a Python function to calculate Mean Squared Error (MSE).
import numpy as np

def mse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)  # convert lists to arrays so subtraction works
    return np.mean((y_true - y_pred) ** 2)

print(mse([1, 2, 3], [1.1, 1.9, 3.2]))  # Example usage
16. How do you implement a basic Neural Network using TensorFlow/Keras?
import tensorflow as tf
from tensorflow import keras

# Model
model = keras.Sequential([
    keras.layers.Dense(10, activation='relu', input_shape=(5,)),
    keras.layers.Dense(1, activation='sigmoid')
])

# Compile
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
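To complete the picture, the compiled model above could then be trained and evaluated; a short usage sketch on random placeholder data (purely illustrative):

import numpy as np

# Random placeholder data: 100 samples, 5 features, binary labels
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, size=(100,))

model.fit(X, y, epochs=5, batch_size=16, verbose=0)
loss, acc = model.evaluate(X, y, verbose=0)
print("Loss:", loss, "Accuracy:", acc)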
17. Write a Python function to compute the Euclidean distance between two points.
import numpy as np

def euclidean_distance(p1, p2):
    return np.sqrt(np.sum((np.array(p1) - np.array(p2)) ** 2))

print(euclidean_distance([1, 2], [4, 6]))  # Output: 5.0
18. How do you evaluate a Classification model in Python?
from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 1]
print("Accuracy:", accuracy_score(y_true, y_pred))
19. Write Python code to perform text preprocessing using NLTK.
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords

nltk.download('punkt')
nltk.download('stopwords')

text = "Natural Language Processing is amazing!"
tokens = word_tokenize(text)
filtered_tokens = [w for w in tokens if w.lower() not in stopwords.words('english')]
print(filtered_tokens)  # Output: ['Natural', 'Language', 'Processing', 'amazing', '!']
20. What is the difference between parametric and non-parametric models?
Answer:
- Parametric models have a fixed number of parameters (e.g., Linear Regression, Logistic Regression).
- Non-parametric models do not assume a fixed number of parameters and can grow in complexity with more data (e.g., Decision Trees, k-NN).
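One way to see the distinction: after fitting, a parametric model is fully summarized by a fixed set of weights, while a non-parametric model such as k-NN keeps the training samples and consults them at prediction time. A minimal scikit-learn sketch on toy data chosen only for illustration:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[1, 2], [2, 3], [3, 4], [6, 7], [7, 8], [8, 9]])
y = np.array([0, 0, 0, 1, 1, 1])

# Parametric: everything learned is summarized in a fixed set of coefficients
logreg = LogisticRegression().fit(X, y)
print("Logistic Regression parameters:", logreg.coef_, logreg.intercept_)

# Non-parametric: k-NN compares new points against the stored training samples
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print("k-NN prediction for [4, 5]:", knn.predict([[4, 5]]))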
21. Implement Principal Component Analysis (PCA) in Python
PCA is used for dimensionality reduction while preserving as much variance as possible.
from sklearn.decomposition import PCA
import numpy as np

# Sample Data (4 samples, 3 features)
X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

# Apply PCA to reduce to 2 components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
print("Reduced Features:\n", X_reduced)
22. Writing a Custom Loss Function in TensorFlow/Keras
Custom loss functions can be useful in advanced models.
import tensorflow as tf

def custom_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))  # Mean Squared Error (MSE)

# Example usage in a model
model.compile(optimizer='adam', loss=custom_loss, metrics=['accuracy'])
23. Implement Logistic Regression from Scratch
Logistic Regression is used for binary classification.
import numpy as np

class LogisticRegression:
    def __init__(self, lr=0.01, epochs=1000):
        self.lr = lr
        self.epochs = epochs
        self.weights = None
        self.bias = None

    def sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0
        for _ in range(self.epochs):
            model = np.dot(X, self.weights) + self.bias
            predictions = self.sigmoid(model)
            dw = (1 / n_samples) * np.dot(X.T, (predictions - y))
            db = (1 / n_samples) * np.sum(predictions - y)
            self.weights -= self.lr * dw
            self.bias -= self.lr * db

    def predict(self, X):
        model = np.dot(X, self.weights) + self.bias
        predictions = self.sigmoid(model)
        return [1 if i > 0.5 else 0 for i in predictions]

# Example Usage
X_train = np.array([[1, 2], [2, 3], [3, 4], [5, 6]])
y_train = np.array([0, 0, 1, 1])

model = LogisticRegression(lr=0.1, epochs=1000)
model.fit(X_train, y_train)
preds = model.predict(X_train)
print("Predictions:", preds)
24. Understanding Transformer Architecture
The Transformer is the backbone of models like GPT and BERT. Key components include:
- Self-Attention: Lets each token attend to all other tokens in the sequence.
- Positional Encoding: Since transformers do not process text sequentially, position information must be injected into the embeddings.
- Multi-Head Attention: Uses multiple attention heads to focus on different parts of the input.
Here is a simplified PyTorch implementation of Self-Attention:
import torch
import torch.nn.functional as F

def self_attention(Q, K, V):
    scores = torch.matmul(Q, K.T) / torch.sqrt(torch.tensor(K.shape[1], dtype=torch.float32))
    attention_weights = F.softmax(scores, dim=-1)
    return torch.matmul(attention_weights, V)

Q = torch.tensor([[1.0, 0.5], [0.3, 0.8]])
K = torch.tensor([[1.0, 0.2], [0.4, 0.9]])
V = torch.tensor([[0.5, 1.0], [0.7, 0.3]])

output = self_attention(Q, K, V)
print("Self-Attention Output:\n", output)
25. Using PyTorch for Deep Learning
PyTorch is widely used for deep learning tasks.
import torch
import torch.nn as nn
import torch.optim as optim

# Simple Neural Network with PyTorch
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(2, 5)
        self.fc2 = nn.Linear(5, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.sigmoid(self.fc2(x))
        return x

# Create model and optimizer
model = SimpleNN()
optimizer = optim.Adam(model.parameters(), lr=0.01)
criterion = nn.BCELoss()

# Sample Data
X = torch.tensor([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]], dtype=torch.float32)
y = torch.tensor([[0.0], [1.0], [1.0]], dtype=torch.float32)

# Training Step
for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(X)
    loss = criterion(outputs, y)
    loss.backward()
    optimizer.step()

print("Final Predictions:\n", model(X).detach().numpy())