66 Commits
1.0 ... master

Author SHA1 Message Date
Aladdin Persson
558557c798 update readme with lightning 2023-03-21 12:44:08 +01:00
Aladdin Persson
8f559d3303 update huggingface code 2023-03-21 12:08:53 +01:00
Aladdin Persson
e4659fe56a huggingface update 2023-03-18 09:51:16 +01:00
Aladdin Persson
94f6c024fe add lightning code, finetuning whisper, recommender system neural collaborative filtering 2023-02-21 16:25:42 +01:00
Aladdin Persson
c646ef65e2 checked GAN code 2022-12-21 14:03:08 +01:00
Aladdin Persson
b6985eccc9 updated and checked CNN architectures still works with latest pytorch 2022-12-20 12:13:12 +01:00
Aladdin Persson
28a6abea27 update readme 2022-12-19 23:56:55 +01:00
Aladdin Persson
3a4d3ae807 update imbalanced dataset comment 2022-12-19 23:54:00 +01:00
Aladdin Persson
0261a682ca update readme 2022-12-19 23:46:11 +01:00
Aladdin Persson
41cce4d160 updated readme with mixed precision tutorial 2022-12-19 23:42:32 +01:00
Aladdin Persson
cd607c395c updated basic tutorials, better comments, code revision, checked it works with latest pytorch version 2022-12-19 23:39:48 +01:00
Aladdin Persson
3f53d68c4f update pretrain progress bar tutorial 2022-12-19 16:29:48 +01:00
Aladdin Persson
058742e581 update mixed precision with comments 2022-12-19 16:17:47 +01:00
Aladdin Persson
8f12620cef update lr scheduler and precision 2022-12-19 16:13:53 +01:00
Aladdin Persson
cc0df999e2 reran and refined old tutorials 2022-12-19 15:57:59 +01:00
dino
088bdb63e9 added_mcts_and_metrics 2022-09-29 11:18:12 +02:00
dino
6c792599cf fullynet code review and update with small improvement 2022-09-23 10:57:47 +02:00
dino
ae581a64e6 vae 2022-09-13 13:04:49 +02:00
Aladdin Persson
ac5dcd03a4 cleaned code 2021-06-05 13:28:56 +02:00
Aladdin Persson
c19bb8675e added kaggle comp solution for facial keypoint 2021-06-05 13:26:29 +02:00
Aladdin Persson
8136ee169f DR kaggle 2021-05-30 16:24:52 +02:00
Aladdin Persson
9675f0d6af add imbalanced classes video code and kaggle cat vs dog 2021-05-27 10:21:14 +02:00
Aladdin Persson
e06671856c stylegan, esrgan, srgan code 2021-05-15 15:03:33 +02:00
Aladdin Persson
3b415a1a3d stylegan, esrgan, srgan code 2021-05-15 15:02:56 +02:00
Aladdin Persson
5f8f410b6e stylegan, esrgan, srgan code 2021-05-15 14:59:43 +02:00
Aladdin Persson
5033cbb567 stylegan, esrgan, srgan code 2021-05-15 14:58:41 +02:00
Aladdin Persson
a2ee9271b5 Merge branch 'master' of https://github.com/aladdinpersson/Machine-Learning-Collection 2021-03-26 14:43:25 +01:00
Aladdin Persson
cb19107179 added link to ms coco yolov3 2021-03-26 14:43:07 +01:00
Aladdin Persson
37762890f0 updated readme with new links 2021-03-24 22:18:05 +01:00
Aladdin Persson
d01c4b98af revisions to code examples 2021-03-24 22:12:45 +01:00
Aladdin Persson
b1e2379528 test 2021-03-24 22:09:25 +01:00
Aladdin Persson
80690d56f8 small revisions to code examples 2021-03-24 22:07:20 +01:00
Aladdin Persson
31c404822a small revisions to code examples 2021-03-24 22:03:12 +01:00
Aladdin Persson
d945e7ae47 update 2021-03-24 22:01:16 +01:00
Aladdin Persson
d8f3bb6123 Merge branch 'master' of https://github.com/aladdinpersson/Machine-Learning-Collection 2021-03-24 21:59:28 +01:00
Aladdin Persson
42a8161013 progan readme update! 2021-03-24 21:58:17 +01:00
Aladdin Persson
e6c7f42c46 revised some code examples 2021-03-24 21:57:40 +01:00
Aladdin Persson
2a9c539b40 fix progan download link 2021-03-24 21:14:47 +01:00
Aladdin Persson
dbab15d0fd update progan 2021-03-24 21:12:13 +01:00
Aladdin Persson
3d615bca61 update progan readme linkW 2021-03-24 21:09:02 +01:00
Aladdin Persson
1b761f1443 Merge pull request #29 from darveenvijayan/patch-1
Update pytorch_inceptionet.py
2021-03-24 13:04:45 +01:00
Aladdin Persson
83d234daec Merge pull request #33 from ankandrew/patch-1
Added bias=False
2021-03-24 13:03:52 +01:00
Aladdin Persson
74597aa8fd updated progan 2021-03-24 13:01:45 +01:00
Aladdin Persson
59b1de7bfe updated progan 2021-03-21 12:19:18 +01:00
Aladdin Persson
c72d1d6a31 damn, copied over wrong train file for ProGAN (will check this more thoroughly before the video is up too 2021-03-19 20:21:14 +01:00
Aladdin Persson
bd6db84daa update funding to github instead of patreon 2021-03-18 00:10:23 +01:00
Aladdin Persson
6a7fa8e853 update to progan 2021-03-17 22:49:31 +01:00
Aladdin Persson
9b6f6cfa18 update to progan 2021-03-17 22:48:50 +01:00
Aladdin Persson
06bd8204b3 update to progan 2021-03-17 22:48:26 +01:00
ankandrew
40d9b0432d Added bias=False
Bias term already included in the BN layers; can be set to False as it is redundant
2021-03-12 15:11:39 -03:00
Aladdin Persson
dc7f4f4ee7 progan cyclegan 2021-03-11 15:58:01 +01:00
Aladdin Persson
707457059e progan cyclegan 2021-03-11 15:56:59 +01:00
Aladdin Persson
8cbaf3ebc3 progan cyclegan 2021-03-11 15:55:27 +01:00
Aladdin Persson
c67e7f88a6 merge branch 'master' of https://github.com/aladdinpersson/Machine-Learning-Collection 2021-03-11 15:51:33 +01:00
Aladdin Persson
2c53205f12 cyclegan, progan 2021-03-11 15:50:44 +01:00
Aladdin Persson
0506a2a878 Update README.md 2021-03-08 23:57:45 +01:00
Aladdin Persson
91b1fd156c cyclegan 2021-03-06 21:09:41 +01:00
Aladdin Persson
2a397b17e2 cyclegan 2021-03-06 21:09:08 +01:00
Aladdin Persson
00ea9fea1f update readmes, added pix2pix 2021-03-06 12:54:46 +01:00
Aladdin Persson
946465e63c update readmes, added pix2pix 2021-03-06 12:52:33 +01:00
Aladdin Persson
65a51c6e64 update readmes, added pix2pix 2021-03-06 12:48:44 +01:00
Aladdin Persson
138f3f34a4 update readmes, added pix2pix 2021-03-06 12:48:16 +01:00
Aladdin Persson
6967774a79 update readmes, added pix2pix 2021-03-06 12:46:51 +01:00
Aladdin Persson
792f4bbb9e update readmes, added pix2pix 2021-03-06 12:46:11 +01:00
Aladdin Persson
9ae1ff3db0 update readmes, added pix2pix 2021-03-06 12:41:03 +01:00
Darveen Vijayan
1ba55b7382 Update pytorch_inceptionet.py 2021-02-10 19:14:55 +08:00
258 changed files with 54425 additions and 3905 deletions
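
One of the merged PRs above, "Added bias=False" (83d234daec), notes that a convolution's bias is redundant when the layer is immediately followed by BatchNorm, since BatchNorm already learns a per-channel shift. A minimal sketch of that pattern (the ConvBNReLU block is hypothetical, not the repository's actual InceptionNet code):

# Hedged sketch of the "bias=False before BatchNorm" pattern; the class name
# ConvBNReLU is illustrative and not taken from the repository.
import torch
from torch import nn

class ConvBNReLU(nn.Module):
    def __init__(self, in_channels, out_channels, **kwargs):
        super().__init__()
        # BatchNorm2d already provides a learnable per-channel shift (beta),
        # so the convolution's own bias would be redundant.
        self.conv = nn.Conv2d(in_channels, out_channels, bias=False, **kwargs)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

if __name__ == "__main__":
    block = ConvBNReLU(3, 16, kernel_size=3, padding=1)
    print(block(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 16, 32, 32])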

12
.github/FUNDING.yml vendored
View File

@@ -1,12 +0,0 @@
# These are supported funding model platforms
github: # Replace with up to 4 GitHub Sponsors-enabled usernames e.g., [user1, user2]
patreon: aladdinpersson # Replace with a single Patreon username
open_collective: # Replace with a single Open Collective username
ko_fi: # Replace with a single Ko-fi username
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
otechie: # Replace with a single Otechie username
custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']

9
.gitignore vendored
View File

@@ -1,3 +1,12 @@
.idea/
ML/Pytorch/more_advanced/image_captioning/flickr8k/
ML/algorithms/svm/__pycache__/utils.cpython-38.pyc
__pycache__/
*.pth.tar
*.DS_STORE
ML/Pytorch/huggingface/train.csv
ML/Pytorch/huggingface/validation.csv
ML/Pytorch/huggingface/test.csv
ML/Pytorch/huggingface/tb_logs/
ML/Pytorch/huggingface/checkpoints/
ML/Pytorch/huggingface/notebooks/

View File

@@ -0,0 +1,48 @@
import torch
import albumentations as A
from albumentations.pytorch import ToTensorV2
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
LEARNING_RATE = 3e-5
WEIGHT_DECAY = 5e-4
BATCH_SIZE = 20
NUM_EPOCHS = 100
NUM_WORKERS = 6
CHECKPOINT_FILE = "b3.pth.tar"
PIN_MEMORY = True
SAVE_MODEL = True
LOAD_MODEL = True
# Data augmentation for images
train_transforms = A.Compose(
[
A.Resize(width=760, height=760),
A.RandomCrop(height=728, width=728),
A.HorizontalFlip(p=0.5),
A.VerticalFlip(p=0.5),
A.RandomRotate90(p=0.5),
A.Blur(p=0.3),
A.CLAHE(p=0.3),
A.ColorJitter(p=0.3),
A.CoarseDropout(max_holes=12, max_height=20, max_width=20, p=0.3),
A.IAAAffine(shear=30, rotate=0, p=0.2, mode="constant"),
A.Normalize(
mean=[0.3199, 0.2240, 0.1609],
std=[0.3020, 0.2183, 0.1741],
max_pixel_value=255.0,
),
ToTensorV2(),
]
)
val_transforms = A.Compose(
[
A.Resize(height=728, width=728),
A.Normalize(
mean=[0.3199, 0.2240, 0.1609],
std=[0.3020, 0.2183, 0.1741],
max_pixel_value=255.0,
),
ToTensorV2(),
]
)

View File

@@ -0,0 +1,56 @@
import config
import os
import pandas as pd
import numpy as np
from torch.utils.data import Dataset, DataLoader
from PIL import Image
from tqdm import tqdm
class DRDataset(Dataset):
def __init__(self, images_folder, path_to_csv, train=True, transform=None):
super().__init__()
self.data = pd.read_csv(path_to_csv)
self.images_folder = images_folder
self.image_files = os.listdir(images_folder)
self.transform = transform
self.train = train
def __len__(self):
return self.data.shape[0] if self.train else len(self.image_files)
def __getitem__(self, index):
if self.train:
image_file, label = self.data.iloc[index]
else:
# if test set, simply return -1 for the label; this lets us
# re-use the same dataset class for the test-set submission later on
image_file, label = self.image_files[index], -1
image_file = image_file.replace(".jpeg", "")
image = np.array(Image.open(os.path.join(self.images_folder, image_file+".jpeg")))
if self.transform:
image = self.transform(image=image)["image"]
return image, label, image_file
if __name__ == "__main__":
"""
Test if everything works ok
"""
dataset = DRDataset(
images_folder="../train/images_resized_650/",
path_to_csv="../train/trainLabels.csv",
transform=config.val_transforms,
)
loader = DataLoader(
dataset=dataset, batch_size=32, num_workers=2, shuffle=True, pin_memory=True
)
for x, label, file in tqdm(loader):
print(x.shape)
print(label.shape)
import sys
sys.exit()

View File

@@ -0,0 +1,82 @@
"""
Tries to remove unnecessary black borders around the images and
"trim" them so they take up the entirety of the image.
It's hacky and not very pretty, but it works :))
"""
import os
import numpy as np
from PIL import Image
import warnings
from multiprocessing import Pool
from tqdm import tqdm
import cv2
def trim(im):
"""
Converts the image to grayscale using cv2, then computes a binary mask
of the pixels that are above a certain threshold. The first row where a
certain percentage of the pixels are above the threshold becomes the first
clip point; same idea for the first column, max row, and max column.
"""
percentage = 0.02
img = np.array(im)
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
im = img_gray > 0.1 * np.mean(img_gray[img_gray != 0])
row_sums = np.sum(im, axis=1)
col_sums = np.sum(im, axis=0)
rows = np.where(row_sums > img.shape[1] * percentage)[0]
cols = np.where(col_sums > img.shape[0] * percentage)[0]
min_row, min_col = np.min(rows), np.min(cols)
max_row, max_col = np.max(rows), np.max(cols)
im_crop = img[min_row : max_row + 1, min_col : max_col + 1]
return Image.fromarray(im_crop)
def resize_maintain_aspect(image, desired_size):
"""
Adapted from a Stack Overflow post (can't remember which one);
this adds padding to maintain the aspect ratio.
"""
old_size = image.size # old_size[0] is in (width, height) format
ratio = float(desired_size) / max(old_size)
new_size = tuple([int(x * ratio) for x in old_size])
im = image.resize(new_size, Image.ANTIALIAS)
new_im = Image.new("RGB", (desired_size, desired_size))
new_im.paste(im, ((desired_size - new_size[0]) // 2, (desired_size - new_size[1]) // 2))
return new_im
def save_single(args):
img_file, input_path_folder, output_path_folder, output_size = args
image_original = Image.open(os.path.join(input_path_folder, img_file))
image = trim(image_original)
image = resize_maintain_aspect(image, desired_size=output_size[0])
image.save(os.path.join(output_path_folder + img_file))
def fast_image_resize(input_path_folder, output_path_folder, output_size=None):
"""
Uses multiprocessing to make it fast
"""
if not output_size:
warnings.warn("Need to specify output_size! For example: output_size=100")
exit()
if not os.path.exists(output_path_folder):
os.makedirs(output_path_folder)
jobs = [
(file, input_path_folder, output_path_folder, output_size)
for file in os.listdir(input_path_folder)
]
with Pool() as p:
list(tqdm(p.imap_unordered(save_single, jobs), total=len(jobs)))
if __name__ == "__main__":
fast_image_resize("../train/images/", "../train/images_resized_150/", output_size=(150, 150))
fast_image_resize("../test/images/", "../test/images_resized_150/", output_size=(150, 150))

View File

@@ -0,0 +1,125 @@
import torch
from torch import nn, optim
import os
import config
from torch.utils.data import DataLoader
from tqdm import tqdm
from sklearn.metrics import cohen_kappa_score
from efficientnet_pytorch import EfficientNet
from dataset import DRDataset
from torchvision.utils import save_image
from utils import (
load_checkpoint,
save_checkpoint,
check_accuracy,
make_prediction,
get_csv_for_blend,
)
def train_one_epoch(loader, model, optimizer, loss_fn, scaler, device):
losses = []
loop = tqdm(loader)
for batch_idx, (data, targets, _) in enumerate(loop):
# save examples and make sure they look ok with the data augmentation,
# tip is to first set mean=[0,0,0], std=[1,1,1] so they look "normal"
#save_image(data, f"hi_{batch_idx}.png")
data = data.to(device=device)
targets = targets.to(device=device)
# forward
with torch.cuda.amp.autocast():
scores = model(data)
loss = loss_fn(scores, targets.unsqueeze(1).float())
losses.append(loss.item())
# backward
optimizer.zero_grad()
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
loop.set_postfix(loss=loss.item())
print(f"Loss average over epoch: {sum(losses)/len(losses)}")
def main():
train_ds = DRDataset(
images_folder="train/images_preprocessed_1000/",
path_to_csv="train/trainLabels.csv",
transform=config.val_transforms,
)
val_ds = DRDataset(
images_folder="train/images_preprocessed_1000/",
path_to_csv="train/valLabels.csv",
transform=config.val_transforms,
)
test_ds = DRDataset(
images_folder="test/images_preprocessed_1000",
path_to_csv="train/trainLabels.csv",
transform=config.val_transforms,
train=False,
)
test_loader = DataLoader(
test_ds, batch_size=config.BATCH_SIZE, num_workers=6, shuffle=False
)
train_loader = DataLoader(
train_ds,
batch_size=config.BATCH_SIZE,
num_workers=config.NUM_WORKERS,
pin_memory=config.PIN_MEMORY,
shuffle=False,
)
val_loader = DataLoader(
val_ds,
batch_size=config.BATCH_SIZE,
num_workers=2,
pin_memory=config.PIN_MEMORY,
shuffle=False,
)
loss_fn = nn.MSELoss()
model = EfficientNet.from_pretrained("efficientnet-b3")
model._fc = nn.Linear(1536, 1)
model = model.to(config.DEVICE)
optimizer = optim.Adam(model.parameters(), lr=config.LEARNING_RATE, weight_decay=config.WEIGHT_DECAY)
scaler = torch.cuda.amp.GradScaler()
if config.LOAD_MODEL and config.CHECKPOINT_FILE in os.listdir():
load_checkpoint(torch.load(config.CHECKPOINT_FILE), model, optimizer, config.LEARNING_RATE)
# Run after training is done and you've achieved good result
# on validation set, then run train_blend.py file to use information
# about both eyes concatenated
get_csv_for_blend(val_loader, model, "../train/val_blend.csv")
get_csv_for_blend(train_loader, model, "../train/train_blend.csv")
get_csv_for_blend(test_loader, model, "../train/test_blend.csv")
make_prediction(model, test_loader, "submission_.csv")
import sys
sys.exit()
#make_prediction(model, test_loader)
for epoch in range(config.NUM_EPOCHS):
train_one_epoch(train_loader, model, optimizer, loss_fn, scaler, config.DEVICE)
# get on validation
preds, labels = check_accuracy(val_loader, model, config.DEVICE)
print(f"QuadraticWeightedKappa (Validation): {cohen_kappa_score(labels, preds, weights='quadratic')}")
# get on train
#preds, labels = check_accuracy(train_loader, model, config.DEVICE)
#print(f"QuadraticWeightedKappa (Training): {cohen_kappa_score(labels, preds, weights='quadratic')}")
if config.SAVE_MODEL:
checkpoint = {
"state_dict": model.state_dict(),
"optimizer": optimizer.state_dict(),
}
save_checkpoint(checkpoint, filename=f"b3_{epoch}.pth.tar")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,126 @@
import torch
from tqdm import tqdm
import numpy as np
from torch import nn
from torch import optim
from torch.utils.data import DataLoader, Dataset
from utils import save_checkpoint, load_checkpoint, check_accuracy
from sklearn.metrics import cohen_kappa_score
import config
import os
import pandas as pd
def make_prediction(model, loader, file):
preds = []
filenames = []
model.eval()
for x, y, files in tqdm(loader):
x = x.to(config.DEVICE)
with torch.no_grad():
predictions = model(x)
# Convert MSE floats to integer predictions
predictions[predictions < 0.5] = 0
predictions[(predictions >= 0.5) & (predictions < 1.5)] = 1
predictions[(predictions >= 1.5) & (predictions < 2.5)] = 2
predictions[(predictions >= 2.5) & (predictions < 3.5)] = 3
predictions[(predictions >= 3.5) & (predictions < 1000000000000)] = 4
predictions = predictions.long().view(-1)
y = y.view(-1)
preds.append(predictions.cpu().numpy())
filenames += map(list, zip(files[0], files[1]))
filenames = [item for sublist in filenames for item in sublist]
df = pd.DataFrame({"image": filenames, "level": np.concatenate(preds, axis=0)})
df.to_csv(file, index=False)
model.train()
print("Done with predictions")
class MyDataset(Dataset):
def __init__(self, csv_file):
self.csv = pd.read_csv(csv_file)
def __len__(self):
return self.csv.shape[0]
def __getitem__(self, index):
example = self.csv.iloc[index, :]
features = example.iloc[: example.shape[0] - 4].to_numpy().astype(np.float32)
labels = example.iloc[-4:-2].to_numpy().astype(np.int64)
filenames = example.iloc[-2:].values.tolist()
return features, labels, filenames
class MyModel(nn.Module):
def __init__(self):
super().__init__()
self.model = nn.Sequential(
nn.BatchNorm1d((1536 + 1) * 2),
nn.Linear((1536+1) * 2, 500),
nn.BatchNorm1d(500),
nn.ReLU(),
nn.Dropout(0.2),
nn.Linear(500, 100),
nn.BatchNorm1d(100),
nn.ReLU(),
nn.Dropout(0.2),
nn.Linear(100, 2),
)
def forward(self, x):
return self.model(x)
if __name__ == "__main__":
model = MyModel().to(config.DEVICE)
ds = MyDataset(csv_file="train/train_blend.csv")
loader = DataLoader(ds, batch_size=256, num_workers=3, pin_memory=True, shuffle=True)
ds_val = MyDataset(csv_file="train/val_blend.csv")
loader_val = DataLoader(
ds_val, batch_size=256, num_workers=3, pin_memory=True, shuffle=True
)
ds_test = MyDataset(csv_file="train/test_blend.csv")
loader_test = DataLoader(
ds_test, batch_size=256, num_workers=2, pin_memory=True, shuffle=False
)
optimizer = optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
loss_fn = nn.MSELoss()
if config.LOAD_MODEL and "linear.pth.tar" in os.listdir():
load_checkpoint(torch.load("linear.pth.tar"), model, optimizer, lr=1e-4)
model.train()
for _ in range(5):
losses = []
for x, y, files in tqdm(loader_val):
x = x.to(config.DEVICE).float()
y = y.to(config.DEVICE).view(-1).float()
# forward
scores = model(x).view(-1)
loss = loss_fn(scores, y)
losses.append(loss.item())
# backward
optimizer.zero_grad()
loss.backward()
# gradient descent or adam step
optimizer.step()
print(f"Loss: {sum(losses)/len(losses)}")
if config.SAVE_MODEL:
checkpoint = {"state_dict": model.state_dict(), "optimizer": optimizer.state_dict()}
save_checkpoint(checkpoint, filename="linear.pth.tar")
preds, labels = check_accuracy(loader_val, model)
print(cohen_kappa_score(labels, preds, weights="quadratic"))
preds, labels = check_accuracy(loader, model)
print(cohen_kappa_score(labels, preds, weights="quadratic"))
make_prediction(model, loader_test, "test_preds.csv")

View File

@@ -0,0 +1,128 @@
import torch
import pandas as pd
import numpy as np
import config
from tqdm import tqdm
import warnings
import torch.nn.functional as F
def make_prediction(model, loader, output_csv="submission.csv"):
preds = []
filenames = []
model.eval()
for x, y, files in tqdm(loader):
x = x.to(config.DEVICE)
with torch.no_grad():
predictions = model(x)
# Convert MSE floats to integer predictions
predictions[predictions < 0.5] = 0
predictions[(predictions >= 0.5) & (predictions < 1.5)] = 1
predictions[(predictions >= 1.5) & (predictions < 2.5)] = 2
predictions[(predictions >= 2.5) & (predictions < 3.5)] = 3
predictions[(predictions >= 3.5) & (predictions < 10000000)] = 4
predictions = predictions.long().squeeze(1)
preds.append(predictions.cpu().numpy())
filenames += files
df = pd.DataFrame({"image": filenames, "level": np.concatenate(preds, axis=0)})
df.to_csv(output_csv, index=False)
model.train()
print("Done with predictions")
def check_accuracy(loader, model, device="cuda"):
model.eval()
all_preds, all_labels = [], []
num_correct = 0
num_samples = 0
for x, y, filename in tqdm(loader):
x = x.to(device=device)
y = y.to(device=device)
with torch.no_grad():
predictions = model(x)
# Convert MSE floats to integer predictions
predictions[predictions < 0.5] = 0
predictions[(predictions >= 0.5) & (predictions < 1.5)] = 1
predictions[(predictions >= 1.5) & (predictions < 2.5)] = 2
predictions[(predictions >= 2.5) & (predictions < 3.5)] = 3
predictions[(predictions >= 3.5) & (predictions < 100)] = 4
predictions = predictions.long().view(-1)
y = y.view(-1)
num_correct += (predictions == y).sum()
num_samples += predictions.shape[0]
# add to lists
all_preds.append(predictions.detach().cpu().numpy())
all_labels.append(y.detach().cpu().numpy())
print(
f"Got {num_correct} / {num_samples} with accuracy {float(num_correct) / float(num_samples) * 100:.2f}"
)
model.train()
return np.concatenate(all_preds, axis=0, dtype=np.int64), np.concatenate(
all_labels, axis=0, dtype=np.int64
)
def save_checkpoint(state, filename="my_checkpoint.pth.tar"):
print("=> Saving checkpoint")
torch.save(state, filename)
def load_checkpoint(checkpoint, model, optimizer, lr):
print("=> Loading checkpoint")
model.load_state_dict(checkpoint["state_dict"])
#optimizer.load_state_dict(checkpoint["optimizer"])
# If we don't do this, the optimizer keeps the learning rate of the old checkpoint,
# which can lead to many hours of debugging :\
for param_group in optimizer.param_groups:
param_group["lr"] = lr
def get_csv_for_blend(loader, model, output_csv_file):
warnings.warn("Important to have shuffle=False (and to ensure batch size is even size) when running get_csv_for_blend also set val_transforms to train_loader!")
model.eval()
filename_first = []
filename_second = []
labels_first = []
labels_second = []
all_features = []
for idx, (images, y, image_files) in enumerate(tqdm(loader)):
images = images.to(config.DEVICE)
with torch.no_grad():
features = F.adaptive_avg_pool2d(
model.extract_features(images), output_size=1
)
features_logits = features.reshape(features.shape[0] // 2, 2, features.shape[1])
preds = model(images).reshape(images.shape[0] // 2, 2, 1)
new_features = (
torch.cat([features_logits, preds], dim=2)
.view(preds.shape[0], -1)
.cpu()
.numpy()
)
all_features.append(new_features)
filename_first += image_files[::2]
filename_second += image_files[1::2]
labels_first.append(y[::2].cpu().numpy())
labels_second.append(y[1::2].cpu().numpy())
all_features = np.concatenate(all_features, axis=0)
df = pd.DataFrame(
data=all_features, columns=[f"f_{idx}" for idx in range(all_features.shape[1])]
)
df["label_first"] = np.concatenate(labels_first, axis=0)
df["label_second"] = np.concatenate(labels_second, axis=0)
df["file_first"] = filename_first
df["file_second"] = filename_second
df.to_csv(output_csv_file, index=False)
model.train()
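
The make_prediction and check_accuracy functions above convert the MSE regression outputs to integer grades with a cascade of hand-written thresholds. An equivalent, more compact formulation (an illustrative sketch, not code from the repository) rounds at the same 0.5/1.5/2.5/3.5 cut-points and clamps to the valid grade range:

import torch

def mse_to_grade(predictions: torch.Tensor, num_classes: int = 5) -> torch.Tensor:
    # floor(x + 0.5) reproduces the 0.5 / 1.5 / 2.5 / 3.5 cut-points exactly,
    # and the clamp maps negatives to grade 0 and anything >= 3.5 to grade 4.
    return torch.clamp(torch.floor(predictions + 0.5), 0, num_classes - 1).long()

if __name__ == "__main__":
    preds = torch.tensor([-0.3, 0.4, 0.6, 1.7, 2.49, 3.5, 11.0])
    print(mse_to_grade(preds))  # tensor([0, 0, 1, 2, 2, 4, 4])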

View File

@@ -0,0 +1,119 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "51c78b68",
"metadata": {},
"outputs": [],
"source": [
"import sklearn\n",
"import pandas as pd\n",
"import numpy as np\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.metrics import log_loss"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "4421a043",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Training data shape: (25000, 2560), labels shape: (25000,)\n"
]
},
{
"data": {
"text/plain": [
"LogisticRegression(max_iter=2000)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X = np.load(f'data_features/X_train_b7.npy')\n",
"y = np.load(f'data_features/y_train_b7.npy')\n",
"\n",
"# Split data and train classifier\n",
"print(f\"Training data shape: {X.shape}, labels shape: {y.shape}\")\n",
"X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.001, random_state=1337)\n",
"clf = LogisticRegression(max_iter=2000)\n",
"clf.fit(X_train, y_train)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d5cfc5b0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"On validation set:\n",
"Accuracy: 1.0\n",
"LOG LOSS: 7.980845755748817e-05 \n",
"%--------------------------------------------------%\n",
"Getting predictions for test set\n",
"Done getting predictions!\n"
]
}
],
"source": [
"# Check on validation\n",
"val_preds= clf.predict_proba(X_val)[:,1]\n",
"print(f\"On validation set:\")\n",
"print(f\"Accuracy: {clf.score(X_val, y_val)}\")\n",
"print(f\"LOG LOSS: {log_loss(y_val, val_preds)} \")\n",
"print(\"%--------------------------------------------------%\")\n",
"\n",
"# Get predictions on test set\n",
"print(\"Getting predictions for test set\")\n",
"X_test = np.load(f'data_features/X_test_b7.npy')\n",
"X_test_preds = clf.predict_proba(X_test)[:,1]\n",
"df = pd.DataFrame({'id': np.arange(1, 12501), 'label': np.clip(X_test_preds, 0.005, 0.995)})\n",
"df.to_csv(f\"submissions/mysubmission.csv\", index=False)\n",
"print(\"Done getting predictions!\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a9cce7af",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1 @@
https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition

View File

@@ -0,0 +1,26 @@
import torch
import albumentations as A
from albumentations.pytorch import ToTensorV2
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
NUM_WORKERS = 4
BATCH_SIZE = 20
PIN_MEMORY = True
LOAD_MODEL = True
SAVE_MODEL = True
CHECKPOINT_FILE = "b7.pth.tar"
WEIGHT_DECAY = 1e-4
LEARNING_RATE = 1e-4
NUM_EPOCHS = 1
basic_transform = A.Compose(
[
A.Resize(height=448, width=448),
A.Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225],
max_pixel_value=255.0,
),
ToTensorV2(),
]
)

View File

@@ -0,0 +1,32 @@
import os
import re
import numpy as np
from torch.utils.data import Dataset
from PIL import Image
class CatDog(Dataset):
def __init__(self, root, transform=None):
self.images = os.listdir(root)
self.images.sort(key=lambda x: int(re.findall(r"\d+", x)[0]))
self.root = root
self.transform = transform
def __len__(self):
return len(self.images)
def __getitem__(self, index):
file = self.images[index]
img = np.array(Image.open(os.path.join(self.root, file)))
if self.transform is not None:
img = self.transform(image=img)["image"]
if "dog" in file:
label = 1
elif "cat" in file:
label = 0
else:
label = -1
return img, label

File diff suppressed because it is too large.

View File

@@ -0,0 +1,93 @@
# Imports
import os
import torch
import torch.nn.functional as F
import numpy as np
import config
from torch import nn, optim
from torch.utils.data import DataLoader
from tqdm import tqdm
from dataset import CatDog
from efficientnet_pytorch import EfficientNet
from utils import check_accuracy, load_checkpoint, save_checkpoint
def save_feature_vectors(model, loader, output_size=(1, 1), file="trainb7"):
model.eval()
images, labels = [], []
for idx, (x, y) in enumerate(tqdm(loader)):
x = x.to(config.DEVICE)
with torch.no_grad():
features = model.extract_features(x)
features = F.adaptive_avg_pool2d(features, output_size=output_size)
images.append(features.reshape(x.shape[0], -1).detach().cpu().numpy())
labels.append(y.numpy())
np.save(f"data_features/X_{file}.npy", np.concatenate(images, axis=0))
np.save(f"data_features/y_{file}.npy", np.concatenate(labels, axis=0))
model.train()
def train_one_epoch(loader, model, loss_fn, optimizer, scaler):
loop = tqdm(loader)
for batch_idx, (data, targets) in enumerate(loop):
data = data.to(config.DEVICE)
targets = targets.to(config.DEVICE).unsqueeze(1).float()
with torch.cuda.amp.autocast():
scores = model(data)
loss = loss_fn(scores, targets)
optimizer.zero_grad()
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
loop.set_postfix(loss=loss.item())
def main():
model = EfficientNet.from_pretrained("efficientnet-b7")
model._fc = nn.Linear(2560, 1)
train_dataset = CatDog(root="data/train/", transform=config.basic_transform)
test_dataset = CatDog(root="data/test/", transform=config.basic_transform)
train_loader = DataLoader(
train_dataset,
shuffle=True,
batch_size=config.BATCH_SIZE,
num_workers=config.NUM_WORKERS,
pin_memory=True,
)
test_loader = DataLoader(
test_dataset,
shuffle=False,
batch_size=config.BATCH_SIZE,
num_workers=config.NUM_WORKERS,
)
model = model.to(config.DEVICE)
scaler = torch.cuda.amp.GradScaler()
loss_fn = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(
model.parameters(), lr=config.LEARNING_RATE, weight_decay=config.WEIGHT_DECAY
)
if config.LOAD_MODEL and config.CHECKPOINT_FILE in os.listdir():
load_checkpoint(torch.load(config.CHECKPOINT_FILE), model)
for epoch in range(config.NUM_EPOCHS):
train_one_epoch(train_loader, model, loss_fn, optimizer, scaler)
check_accuracy(train_loader, model, loss_fn)
if config.SAVE_MODEL:
checkpoint = {"state_dict": model.state_dict(), "optimizer": optimizer.state_dict()}
save_checkpoint(checkpoint, filename=config.CHECKPOINT_FILE)
save_feature_vectors(model, train_loader, output_size=(1, 1), file="train_b7")
save_feature_vectors(model, test_loader, output_size=(1, 1), file="test_b7")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,192 @@
import torch
import os
import pandas as pd
import numpy as np
import albumentations as A
from albumentations.pytorch import ToTensorV2
import config
from tqdm import tqdm
from dataset import CatDog
from torch.utils.data import DataLoader
from sklearn.metrics import log_loss
def check_accuracy(
loader, model, loss_fn, input_shape=None, toggle_eval=True, print_accuracy=True
):
"""
Check accuracy of model on data from loader
"""
if toggle_eval:
model.eval()
device = next(model.parameters()).device
num_correct = 0
num_samples = 0
y_preds = []
y_true = []
with torch.no_grad():
for x, y in loader:
x = x.to(device=device)
y = y.to(device=device)
if input_shape:
x = x.reshape(x.shape[0], *input_shape)
scores = model(x)
predictions = torch.sigmoid(scores) > 0.5
y_preds.append(torch.clip(torch.sigmoid(scores), 0.005, 0.995).cpu().numpy())
y_true.append(y.cpu().numpy())
num_correct += (predictions.squeeze(1) == y).sum()
num_samples += predictions.size(0)
accuracy = num_correct / num_samples
if toggle_eval:
model.train()
if print_accuracy:
print(f"Accuracy: {accuracy * 100:.2f}%")
print(log_loss(np.concatenate(y_true, axis=0), np.concatenate(y_preds, axis=0)))
return accuracy
def save_checkpoint(state, filename="my_checkpoint.pth.tar"):
print("=> Saving checkpoint")
torch.save(state, filename)
def load_checkpoint(checkpoint, model):
print("=> Loading checkpoint")
model.load_state_dict(checkpoint["state_dict"])
def create_submission(model, model_name, files_dir):
my_transforms = {
"base": A.Compose(
[
A.Resize(height=240, width=240),
A.Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225],
max_pixel_value=255.0,
),
ToTensorV2(),
]
),
"horizontal_flip": A.Compose(
[
A.Resize(height=240, width=240),
A.HorizontalFlip(p=1.0),
A.Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225],
max_pixel_value=255.0,
),
ToTensorV2(),
]
),
"vertical_flip": A.Compose(
[
A.Resize(height=240, width=240),
A.VerticalFlip(p=1.0),
A.Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225],
max_pixel_value=255.0,
),
ToTensorV2(),
]
),
"coloring": A.Compose(
[
A.Resize(height=240, width=240),
A.ColorJitter(p=1.0),
A.Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225],
max_pixel_value=255.0,
),
ToTensorV2(),
]
),
"rotate": A.Compose(
[
A.Resize(height=240, width=240),
A.Rotate(p=1.0, limit=45),
A.Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225],
max_pixel_value=255.0,
),
ToTensorV2(),
]
),
"shear": A.Compose(
[
A.Resize(height=240, width=240),
A.IAAAffine(p=1.0),
A.Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225],
max_pixel_value=255.0,
),
ToTensorV2(),
]
),
}
for t in ["base", "horizontal_flip", "vertical_flip", "coloring", "rotate", "shear"]:
predictions = []
labels = []
all_files = []
test_dataset = MyDataset(root=files_dir, transform=my_transforms[t])
test_loader = DataLoader(
test_dataset, batch_size=32, num_workers=4, shuffle=False, pin_memory=True
)
model.eval()
for idx, (x, y, filenames) in enumerate(tqdm(test_loader)):
x = x.to(config.DEVICE)
with torch.no_grad():
outputs = (
torch.clip(torch.sigmoid(model(x)), 0.005, 0.995).squeeze(1).cpu().numpy()
)
predictions.append(outputs)
labels += y.numpy().tolist()
all_files += filenames
df = pd.DataFrame(
{
"id": np.arange(
1,
(len(predictions) - 1) * predictions[0].shape[0]
+ predictions[-1].shape[0]
+ 1,
),
"label": np.concatenate(predictions, axis=0),
}
)
df.to_csv(f"predictions_test/submission_{model_name}_{t}.csv", index=False)
model.train()
print(f"Created submission file for model {model_name} and transform {t}")
def blending_ensemble_data():
pred_csvs = []
root_dir = "predictions_validation/"
for file in os.listdir(root_dir):
if "label" not in file:
df = pd.read_csv(root_dir + "/" + file)
pred_csvs.append(df)
else:
label_csv = pd.read_csv(root_dir + "/" + file)
all_preds = pd.concat(pred_csvs, axis=1)
print(all_preds)
if __name__ == "__main__":
blending_ensemble_data()

View File

@@ -0,0 +1,61 @@
import torch
import albumentations as A
from albumentations.pytorch import ToTensorV2
import cv2
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
LEARNING_RATE = 1e-4
WEIGHT_DECAY = 5e-4
BATCH_SIZE = 64
NUM_EPOCHS = 100
NUM_WORKERS = 4
CHECKPOINT_FILE = "b0_4.pth.tar"
PIN_MEMORY = True
SAVE_MODEL = True
LOAD_MODEL = True
# Data augmentation for images
train_transforms = A.Compose(
[
A.Resize(width=96, height=96),
A.Rotate(limit=15, border_mode=cv2.BORDER_CONSTANT, p=0.8),
A.IAAAffine(shear=15, scale=1.0, mode="constant", p=0.2),
A.RandomBrightnessContrast(contrast_limit=0.5, brightness_limit=0.5, p=0.2),
A.OneOf([
A.GaussNoise(p=0.8),
A.CLAHE(p=0.8),
A.ImageCompression(p=0.8),
A.RandomGamma(p=0.8),
A.Posterize(p=0.8),
A.Blur(p=0.8),
], p=1.0),
A.OneOf([
A.GaussNoise(p=0.8),
A.CLAHE(p=0.8),
A.ImageCompression(p=0.8),
A.RandomGamma(p=0.8),
A.Posterize(p=0.8),
A.Blur(p=0.8),
], p=1.0),
A.ShiftScaleRotate(shift_limit=0.1, scale_limit=0.1, rotate_limit=0, p=0.2, border_mode=cv2.BORDER_CONSTANT),
A.Normalize(
mean=[0.4897, 0.4897, 0.4897],
std=[0.2330, 0.2330, 0.2330],
max_pixel_value=255.0,
),
ToTensorV2(),
], keypoint_params=A.KeypointParams(format="xy", remove_invisible=False),
)
val_transforms = A.Compose(
[
A.Resize(height=96, width=96),
A.Normalize(
mean=[0.4897, 0.4897, 0.4897],
std=[0.2330, 0.2330, 0.2330],
max_pixel_value=255.0,
),
ToTensorV2(),
], keypoint_params=A.KeypointParams(format="xy", remove_invisible=False),
)

View File

@@ -0,0 +1,50 @@
import pandas as pd
import numpy as np
import config
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader, Dataset
class FacialKeypointDataset(Dataset):
def __init__(self, csv_file, train=True, transform=None):
super().__init__()
self.data = pd.read_csv(csv_file)
self.category_names = ['left_eye_center_x', 'left_eye_center_y', 'right_eye_center_x', 'right_eye_center_y', 'left_eye_inner_corner_x', 'left_eye_inner_corner_y', 'left_eye_outer_corner_x', 'left_eye_outer_corner_y', 'right_eye_inner_corner_x', 'right_eye_inner_corner_y', 'right_eye_outer_corner_x', 'right_eye_outer_corner_y', 'left_eyebrow_inner_end_x', 'left_eyebrow_inner_end_y', 'left_eyebrow_outer_end_x', 'left_eyebrow_outer_end_y', 'right_eyebrow_inner_end_x', 'right_eyebrow_inner_end_y', 'right_eyebrow_outer_end_x', 'right_eyebrow_outer_end_y', 'nose_tip_x', 'nose_tip_y', 'mouth_left_corner_x', 'mouth_left_corner_y', 'mouth_right_corner_x', 'mouth_right_corner_y', 'mouth_center_top_lip_x', 'mouth_center_top_lip_y', 'mouth_center_bottom_lip_x', 'mouth_center_bottom_lip_y']
self.transform = transform
self.train = train
def __len__(self):
return self.data.shape[0]
def __getitem__(self, index):
if self.train:
image = np.array(self.data.iloc[index, 30].split()).astype(np.float32)
labels = np.array(self.data.iloc[index, :30].tolist())
labels[np.isnan(labels)] = -1
else:
image = np.array(self.data.iloc[index, 1].split()).astype(np.float32)
labels = np.zeros(30)
ignore_indices = labels == -1
labels = labels.reshape(15, 2)
if self.transform:
image = np.repeat(image.reshape(96, 96, 1), 3, 2).astype(np.uint8)
augmentations = self.transform(image=image, keypoints=labels)
image = augmentations["image"]
labels = augmentations["keypoints"]
labels = np.array(labels).reshape(-1)
labels[ignore_indices] = -1
return image, labels.astype(np.float32)
if __name__ == "__main__":
ds = FacialKeypointDataset(csv_file="data/train_4.csv", train=True, transform=config.train_transforms)
loader = DataLoader(ds, batch_size=1, shuffle=True, num_workers=0)
for idx, (x, y) in enumerate(loader):
plt.imshow(x[0][0].detach().cpu().numpy(), cmap='gray')
plt.plot(y[0][0::2].detach().cpu().numpy(), y[0][1::2].detach().cpu().numpy(), "go")
plt.show()

View File

@@ -0,0 +1,19 @@
import numpy as np
import pandas as pd
import os
from PIL import Image
def extract_images_from_csv(csv, column, save_folder, resize=(96, 96)):
if not os.path.exists(save_folder):
os.makedirs(save_folder)
for idx, image in enumerate(csv[column]):
image = np.array(image.split()).astype(np.uint8)
image = image.reshape(resize[0], resize[1])
img = Image.fromarray(image, 'L')
img.save(save_folder+f"img_{idx}.png")
csv = pd.read_csv("test.csv")
extract_images_from_csv(csv, "Image", "data/test/")

File diff suppressed because it is too large.

View File

@@ -0,0 +1,111 @@
import torch
from dataset import FacialKeypointDataset
from torch import nn, optim
import os
import config
from torch.utils.data import DataLoader
from tqdm import tqdm
from efficientnet_pytorch import EfficientNet
from utils import (
load_checkpoint,
save_checkpoint,
get_rmse,
get_submission
)
def train_one_epoch(loader, model, optimizer, loss_fn, scaler, device):
losses = []
loop = tqdm(loader)
num_examples = 0
for batch_idx, (data, targets) in enumerate(loop):
data = data.to(device=device)
targets = targets.to(device=device)
# forward
scores = model(data)
scores[targets == -1] = -1
loss = loss_fn(scores, targets)
num_examples += torch.numel(scores[targets != -1])
losses.append(loss.item())
# backward
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"Loss average over epoch: {(sum(losses)/num_examples)**0.5}")
def main():
train_ds = FacialKeypointDataset(
csv_file="data/train_4.csv",
transform=config.train_transforms,
)
train_loader = DataLoader(
train_ds,
batch_size=config.BATCH_SIZE,
num_workers=config.NUM_WORKERS,
pin_memory=config.PIN_MEMORY,
shuffle=True,
)
val_ds = FacialKeypointDataset(
transform=config.val_transforms,
csv_file="data/val_4.csv",
)
val_loader = DataLoader(
val_ds,
batch_size=config.BATCH_SIZE,
num_workers=config.NUM_WORKERS,
pin_memory=config.PIN_MEMORY,
shuffle=False,
)
test_ds = FacialKeypointDataset(
csv_file="data/test.csv",
transform=config.val_transforms,
train=False,
)
test_loader = DataLoader(
test_ds,
batch_size=1,
num_workers=config.NUM_WORKERS,
pin_memory=config.PIN_MEMORY,
shuffle=False,
)
loss_fn = nn.MSELoss(reduction="sum")
model = EfficientNet.from_pretrained("efficientnet-b0")
model._fc = nn.Linear(1280, 30)
model = model.to(config.DEVICE)
optimizer = optim.Adam(model.parameters(), lr=config.LEARNING_RATE, weight_decay=config.WEIGHT_DECAY)
scaler = torch.cuda.amp.GradScaler()
model_4 = EfficientNet.from_pretrained("efficientnet-b0")
model_4._fc = nn.Linear(1280, 30)
model_15 = EfficientNet.from_pretrained("efficientnet-b0")
model_15._fc = nn.Linear(1280, 30)
model_4 = model_4.to(config.DEVICE)
model_15 = model_15.to(config.DEVICE)
if config.LOAD_MODEL and config.CHECKPOINT_FILE in os.listdir():
load_checkpoint(torch.load(config.CHECKPOINT_FILE), model, optimizer, config.LEARNING_RATE)
load_checkpoint(torch.load("b0_4.pth.tar"), model_4, optimizer, config.LEARNING_RATE)
load_checkpoint(torch.load("b0_15.pth.tar"), model_15, optimizer, config.LEARNING_RATE)
get_submission(test_loader, test_ds, model_15, model_4)
for epoch in range(config.NUM_EPOCHS):
get_rmse(val_loader, model, loss_fn, config.DEVICE)
train_one_epoch(train_loader, model, optimizer, loss_fn, scaler, config.DEVICE)
# get on validation
if config.SAVE_MODEL:
checkpoint = {
"state_dict": model.state_dict(),
"optimizer": optimizer.state_dict(),
}
save_checkpoint(checkpoint, filename=config.CHECKPOINT_FILE)
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,71 @@
import torch
import numpy as np
import config
import pandas as pd
from tqdm import tqdm
def get_submission(loader, dataset, model_15, model_4):
"""
This could be done a lot faster, but the inefficient approach below
was quick enough for this use case, so it was left as is.
"""
model_15.eval()
model_4.eval()
id_lookup = pd.read_csv("data/IdLookupTable.csv")
predictions = []
image_id = 1
for image, label in tqdm(loader):
image = image.to(config.DEVICE)
preds_15 = torch.clip(model_15(image).squeeze(0), 0.0, 96.0)
preds_4 = torch.clip(model_4(image).squeeze(0), 0.0, 96.0)
feature_names = id_lookup.loc[id_lookup["ImageId"] == image_id]["FeatureName"]
for feature_name in feature_names:
feature_index = dataset.category_names.index(feature_name)
if feature_names.shape[0] < 10:
predictions.append(preds_4[feature_index].item())
else:
predictions.append(preds_15[feature_index].item())
image_id += 1
df = pd.DataFrame({"RowId": np.arange(1, len(predictions)+1), "Location": predictions})
df.to_csv("submission.csv", index=False)
model_15.train()
model_4.train()
def get_rmse(loader, model, loss_fn, device):
model.eval()
num_examples = 0
losses = []
for batch_idx, (data, targets) in enumerate(loader):
data = data.to(device=device)
targets = targets.to(device=device)
# forward
scores = model(data)
loss = loss_fn(scores[targets != -1], targets[targets != -1])
num_examples += scores[targets != -1].shape[0]
losses.append(loss.item())
model.train()
print(f"Loss on val: {(sum(losses)/num_examples)**0.5}")
def save_checkpoint(state, filename="my_checkpoint.pth.tar"):
print("=> Saving checkpoint")
torch.save(state, filename)
def load_checkpoint(checkpoint, model, optimizer, lr):
print("=> Loading checkpoint")
model.load_state_dict(checkpoint["state_dict"])
optimizer.load_state_dict(checkpoint["optimizer"])
# If we don't do this, the optimizer keeps the learning rate of the old checkpoint,
# which can lead to many hours of debugging :\
for param_group in optimizer.param_groups:
param_group["lr"] = lr

[30 binary files changed, contents not shown: 2 binaries without previews, 3 images removed (40 KiB, 12 KiB, 410 KiB), and 25 images added.]

View File

@@ -0,0 +1,93 @@
"""
This code is for dealing with imbalanced datasets in PyTorch. Imbalanced datasets
are those where the number of samples in one or more classes is significantly lower
than the number of samples in the other classes. This can be a problem because it
can lead to a model that is biased towards the more common classes, which can result
in poor performance on the less common classes.
To deal with imbalanced datasets, this code implements two methods: oversampling and
class weighting.
Oversampling involves generating additional samples for the underrepresented classes,
while class weighting involves assigning higher weights to the loss of samples in the
underrepresented classes, so that the model pays more attention to them.
In this code, the get_loader function takes a root directory for a dataset and a batch
size, and returns a PyTorch data loader. The data loader is used to iterate over the
dataset in batches. The get_loader function first applies some transformations to the
images in the dataset using the transforms module from torchvision. Then it calculates
the class weights based on the number of samples in each class. It then creates a
WeightedRandomSampler object, which is used to randomly select a batch of samples with a
probability proportional to their weights. Finally, it creates the data loader using the
dataset and the weighted random sampler.
The main function then uses the data loader to iterate over the dataset for 10 epochs,
and counts the number of samples in each class. Finally, it prints the counts for each class.
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-04-08: Initial coding
* 2021-03-24: Added more detailed comments also removed part of
check_accuracy which would only work specifically on MNIST.
* 2022-12-19: Updated detailed comments, small code revision, checked code still works with latest PyTorch.
"""
import torch
import torchvision.datasets as datasets
import os
from torch.utils.data import WeightedRandomSampler, DataLoader
import torchvision.transforms as transforms
import torch.nn as nn
# Methods for dealing with imbalanced datasets:
# 1. Oversampling (probably preferable)
# 2. Class weighting
def get_loader(root_dir, batch_size):
my_transforms = transforms.Compose(
[
transforms.Resize((224, 224)),
transforms.ToTensor(),
]
)
dataset = datasets.ImageFolder(root=root_dir, transform=my_transforms)
subdirectories = dataset.classes
class_weights = []
# loop through each subdirectory and calculate the class weight
# that is 1 / len(files) in that subdirectory
for subdir in subdirectories:
files = os.listdir(os.path.join(root_dir, subdir))
class_weights.append(1 / len(files))
sample_weights = [0] * len(dataset)
for idx, (data, label) in enumerate(dataset):
class_weight = class_weights[label]
sample_weights[idx] = class_weight
sampler = WeightedRandomSampler(
sample_weights, num_samples=len(sample_weights), replacement=True
)
loader = DataLoader(dataset, batch_size=batch_size, sampler=sampler)
return loader
def main():
loader = get_loader(root_dir="dataset", batch_size=8)
num_retrievers = 0
num_elkhounds = 0
for epoch in range(10):
for data, labels in loader:
num_retrievers += torch.sum(labels == 0)
num_elkhounds += torch.sum(labels == 1)
print(num_retrievers.item())
print(num_elkhounds.item())
if __name__ == "__main__":
main()
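
The file above implements option 1 from its comments (oversampling with WeightedRandomSampler). For completeness, a minimal sketch of option 2, class weighting, which the comments mention but the file does not show; the two-class weights below are illustrative assumptions, not values from the repository:

import torch
import torch.nn as nn

# Class weighting: keep the data as-is and weight the loss instead, so errors on
# the rare class cost more. In practice derive the weights from class counts,
# e.g. weight_c = total_samples / (num_classes * count_c).
class_weights = torch.tensor([1.0, 50.0])  # assumes class 1 is roughly 50x rarer
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

# Dummy batch of 8 samples with 2 classes, just to show the call.
scores = torch.randn(8, 2)
labels = torch.randint(0, 2, (8,))
print(loss_fn(scores, labels).item())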

View File

@@ -3,6 +3,7 @@ import albumentations as A
 import numpy as np
 from utils import plot_examples
 from PIL import Image
+from tqdm import tqdm
 image = Image.open("images/elon.jpeg")
@@ -14,18 +15,20 @@ transform = A.Compose(
         A.HorizontalFlip(p=0.5),
         A.VerticalFlip(p=0.1),
         A.RGBShift(r_shift_limit=25, g_shift_limit=25, b_shift_limit=25, p=0.9),
-        A.OneOf([
-            A.Blur(blur_limit=3, p=0.5),
-            A.ColorJitter(p=0.5),
-        ], p=1.0),
+        A.OneOf(
+            [
+                A.Blur(blur_limit=3, p=0.5),
+                A.ColorJitter(p=0.5),
+            ],
+            p=1.0,
+        ),
     ]
 )
 images_list = [image]
 image = np.array(image)
-for i in range(15):
+for i in tqdm(range(15)):
     augmentations = transform(image=image)
     augmented_img = augmentations["image"]
     images_list.append(augmented_img)
 plot_examples(images_list)

View File

@@ -8,6 +8,7 @@ from albumentations.pytorch import ToTensorV2
 from torch.utils.data import Dataset
 import os
 class ImageFolder(Dataset):
     def __init__(self, root_dir, transform=None):
         super(ImageFolder, self).__init__()
@@ -18,7 +19,7 @@ class ImageFolder(Dataset):
         for index, name in enumerate(self.class_names):
             files = os.listdir(os.path.join(root_dir, name))
-            self.data += list(zip(files, [index]*len(files)))
+            self.data += list(zip(files, [index] * len(files)))
     def __len__(self):
         return len(self.data)
@@ -43,10 +44,13 @@ transform = A.Compose(
         A.HorizontalFlip(p=0.5),
         A.VerticalFlip(p=0.1),
         A.RGBShift(r_shift_limit=25, g_shift_limit=25, b_shift_limit=25, p=0.9),
-        A.OneOf([
-            A.Blur(blur_limit=3, p=0.5),
-            A.ColorJitter(p=0.5),
-        ], p=1.0),
+        A.OneOf(
+            [
+                A.Blur(blur_limit=3, p=0.5),
+                A.ColorJitter(p=0.5),
+            ],
+            p=1.0,
+        ),
         A.Normalize(
             mean=[0, 0, 0],
             std=[1, 1, 1],
@@ -58,5 +62,5 @@ transform = A.Compose(
 dataset = ImageFolder(root_dir="cat_dogs", transform=transform)
-for x,y in dataset:
+for x, y in dataset:
     print(x.shape)

View File

@@ -8,7 +8,7 @@ import albumentations as A
 def visualize(image):
     plt.figure(figsize=(10, 10))
-    plt.axis('off')
+    plt.axis("off")
     plt.imshow(image)
     plt.show()
@@ -22,7 +22,7 @@ def plot_examples(images, bboxes=None):
         if bboxes is not None:
             img = visualize_bbox(images[i - 1], bboxes[i - 1], class_name="Elon")
         else:
-            img = images[i-1]
+            img = images[i - 1]
         fig.add_subplot(rows, columns, i)
         plt.imshow(img)
     plt.show()

View File

@@ -1,131 +0,0 @@
# Imports
import os
from typing import Union
import torch.nn.functional as F # All functions that don't have any parameters
import pandas as pd
import torch
import torch.nn as nn # All neural network modules, nn.Linear, nn.Conv2d, BatchNorm, Loss functions
import torch.optim as optim # For all Optimization algorithms, SGD, Adam, etc.
import torchvision
import torchvision.transforms as transforms # Transformations we can perform on our dataset
from pandas import io
# from skimage import io
from torch.utils.data import (
Dataset,
DataLoader,
) # Gives easier dataset management and creates mini batches
import torch.nn as nn # All neural network modules, nn.Linear, nn.Conv2d, BatchNorm, Loss functions
# Create Fully Connected Network
class NN(nn.Module):
def __init__(self, input_size, num_classes):
super(NN, self).__init__()
self.fc1 = nn.Linear(input_size, 50)
self.fc2 = nn.Linear(50, num_classes)
def forward(self, x):
x = F.relu(self.fc1(x))
x = self.fc2(x)
return x
class SoloDataset(Dataset):
def __init__(self, csv_file, root_dir, transform=None):
self.annotations = pd.read_csv(csv_file)
self.root_dir = root_dir
self.transform = transform
def __len__(self):
return len(self.annotations)
def __getitem__(self, index):
x_data = self.annotations.iloc[index, 0:11]
x_data = torch.tensor(x_data)
y_label = torch.tensor(int(self.annotations.iloc[index, 11]))
return (x_data.float(), y_label)
# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Hyperparameters
num_classes = 26
learning_rate = 1e-3
batch_size = 5
num_epochs = 30
input_size = 11
# Load Data
dataset = SoloDataset(
csv_file="power.csv", root_dir="test123", transform=transforms.ToTensor()
)
train_set, test_set = torch.utils.data.random_split(dataset, [2900, 57])
train_loader = DataLoader(dataset=train_set, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_set, batch_size=batch_size, shuffle=True)
# Model
model = NN(input_size=input_size, num_classes=num_classes).to(device)
# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
print(len(train_set))
print(len(test_set))
# Train Network
for epoch in range(num_epochs):
losses = []
for batch_idx, (data, targets) in enumerate(train_loader):
# Get data to cuda if possible
data = data.to(device=device)
targets = targets.to(device=device)
# forward
scores = model(data)
loss = criterion(scores, targets)
losses.append(loss.item())
# backward
optimizer.zero_grad()
loss.backward()
# gradient descent or adam step
optimizer.step()
print(f"Cost at epoch {epoch} is {sum(losses) / len(losses)}")
# Check accuracy on training to see how good our model is
def check_accuracy(loader, model):
num_correct = 0
num_samples = 0
model.eval()
with torch.no_grad():
for x, y in loader:
x = x.to(device=device)
y = y.to(device=device)
scores = model(x)
_, predictions = scores.max(1)
num_correct += (predictions == y).sum()
num_samples += predictions.size(0)
print(
f"Got {num_correct} / {num_samples} with accuracy {float(num_correct) / float(num_samples) * 100:.2f}"
)
model.train()
print("Checking accuracy on Training Set")
check_accuracy(train_loader, model)
print("Checking accuracy on Test Set")
check_accuracy(test_loader, model)

View File

@@ -6,7 +6,7 @@ label (0 for cat, 1 for dog).
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com> Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-04-03 Initial coding * 2020-04-03 Initial coding
* 2022-12-19 Updated with better comments, improved code using PIL, and checked code still functions as intended.
""" """
# Imports # Imports
@@ -17,7 +17,7 @@ import torchvision.transforms as transforms # Transformations we can perform on
import torchvision import torchvision
import os import os
import pandas as pd import pandas as pd
from skimage import io from PIL import Image
from torch.utils.data import ( from torch.utils.data import (
Dataset, Dataset,
DataLoader, DataLoader,
@@ -35,7 +35,7 @@ class CatsAndDogsDataset(Dataset):
def __getitem__(self, index): def __getitem__(self, index):
img_path = os.path.join(self.root_dir, self.annotations.iloc[index, 0]) img_path = os.path.join(self.root_dir, self.annotations.iloc[index, 0])
image = io.imread(img_path) image = Image.open(img_path)
y_label = torch.tensor(int(self.annotations.iloc[index, 1])) y_label = torch.tensor(int(self.annotations.iloc[index, 1]))
if self.transform: if self.transform:
@@ -50,7 +50,7 @@ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Hyperparameters # Hyperparameters
in_channel = 3 in_channel = 3
num_classes = 2 num_classes = 2
learning_rate = 1e-3 learning_rate = 3e-4
batch_size = 32 batch_size = 32
num_epochs = 10 num_epochs = 10
@@ -69,12 +69,19 @@ train_loader = DataLoader(dataset=train_set, batch_size=batch_size, shuffle=True
test_loader = DataLoader(dataset=test_set, batch_size=batch_size, shuffle=True) test_loader = DataLoader(dataset=test_set, batch_size=batch_size, shuffle=True)
# Model # Model
model = torchvision.models.googlenet(pretrained=True) model = torchvision.models.googlenet(weights="DEFAULT")
# freeze all layers, change final linear layer with num_classes
for param in model.parameters():
param.requires_grad = False
# final layer is not frozen
model.fc = nn.Linear(in_features=1024, out_features=num_classes)
model.to(device) model.to(device)
# Loss and optimizer # Loss and optimizer
criterion = nn.CrossEntropyLoss() criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate) optimizer = optim.Adam(model.parameters(), lr=learning_rate, weight_decay=1e-5)
# Train Network # Train Network
for epoch in range(num_epochs): for epoch in range(num_epochs):
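
A minimal, self-contained sketch of the fine-tuning pattern introduced in the diff above (assuming torchvision >= 0.13 for the weights argument): freeze the pretrained backbone and train only the replaced head.

import torch.nn as nn
import torch.optim as optim
import torchvision

# Load a pretrained GoogLeNet and freeze every existing parameter
model = torchvision.models.googlenet(weights="DEFAULT")
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head; the new layer is the only trainable part
model.fc = nn.Linear(in_features=1024, out_features=2)

# Passing all parameters also works, since frozen ones never receive gradients
optimizer = optim.Adam(model.parameters(), lr=3e-4, weight_decay=1e-5)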

File diff suppressed because it is too large

View File

@@ -0,0 +1,3 @@
#!/bin/sh
wget https://www.kaggle.com/datasets/e1cd22253a9b23b073794872bf565648ddbe4f17e7fa9e74766ad3707141adeb/download?datasetVersionNumber=1

View File

@@ -1,3 +1,15 @@
"""
Introductory tutorial on how to deal with custom text datasets in PyTorch.
Note that there are better ways to do this when dealing with huge text datasets.
But this is a good way of understanding how it works and can be used as a starting
point, particularly for smaller/medium datasets.
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-04-09 Initial coding
* 2022-12-19 Updated comments, minor code revision, and checked code still works with latest PyTorch.
"""
import os # when loading file paths import os # when loading file paths
import pandas as pd # for lookup in annotation file import pandas as pd # for lookup in annotation file
import spacy # for tokenizer import spacy # for tokenizer
@@ -15,8 +27,8 @@ import torchvision.transforms as transforms
# of same seq_len and setup dataloader) # of same seq_len and setup dataloader)
# Note that loading the image is very easy compared to the text! # Note that loading the image is very easy compared to the text!
# Download with: python -m spacy download en # Download with: python -m spacy download en_core_web_sm
spacy_eng = spacy.load("en") spacy_eng = spacy.load("en_core_web_sm")
class Vocabulary: class Vocabulary:
@@ -130,7 +142,10 @@ def get_loader(
if __name__ == "__main__": if __name__ == "__main__":
transform = transforms.Compose( transform = transforms.Compose(
[transforms.Resize((224, 224)), transforms.ToTensor(),] [
transforms.Resize((224, 224)),
transforms.ToTensor(),
]
) )
loader, dataset = get_loader( loader, dataset = get_loader(
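
For readers following the tokenizer change above, here is a tiny sketch of spaCy-based tokenization and numericalization in the spirit of the Vocabulary class. It assumes en_core_web_sm has been installed with python -m spacy download en_core_web_sm:

import spacy  # assumes: pip install spacy && python -m spacy download en_core_web_sm

spacy_eng = spacy.load("en_core_web_sm")

def tokenize(text):
    # Lowercased spaCy tokens, the same tokenizer the dataset code relies on
    return [tok.text.lower() for tok in spacy_eng.tokenizer(text)]

# Toy vocabulary with the usual special tokens
itos = {0: "<PAD>", 1: "<SOS>", 2: "<EOS>", 3: "<UNK>"}
stoi = {token: index for index, token in itos.items()}
for token in tokenize("A dog runs across a green field"):
    if token not in stoi:
        stoi[token] = len(stoi)

def numericalize(text):
    return [stoi.get(token, stoi["<UNK>"]) for token in tokenize(text)]

print(numericalize("a dog runs"))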

View File

@@ -0,0 +1,190 @@
"""
Simple pytorch lightning example
"""
# Imports
import torch
import torch.nn.functional as F # Parameterless functions, like (some) activation functions
import torchvision.datasets as datasets # Standard datasets
import torchvision.transforms as transforms # Transformations we can perform on our dataset for augmentation
from torch import optim # For optimizers like SGD, Adam, etc.
from torch import nn # All neural network modules
from torch.utils.data import (
DataLoader,
) # Gives easier dataset managment by creating mini batches etc.
from tqdm import tqdm # For nice progress bar!
import pytorch_lightning as pl
import torchmetrics
from pytorch_lightning.callbacks import Callback, EarlyStopping
precision = "medium"
torch.set_float32_matmul_precision(precision)
criterion = nn.CrossEntropyLoss()
## use 20% of training data for validation
# train_set_size = int(len(train_dataset) * 0.8)
# valid_set_size = len(train_dataset) - train_set_size
#
## split the train set into two
# seed = torch.Generator().manual_seed(42)
# train_dataset, val_dataset = torch.utils.data.random_split(
# train_dataset, [train_set_size, valid_set_size], generator=seed
# )
class CNNLightning(pl.LightningModule):
def __init__(self, lr=3e-4, in_channels=1, num_classes=10):
super().__init__()
self.lr = lr
self.train_acc = torchmetrics.Accuracy(task="multiclass", num_classes=10)
self.test_acc = torchmetrics.Accuracy(task="multiclass", num_classes=10)
self.conv1 = nn.Conv2d(
in_channels=in_channels,
out_channels=8,
kernel_size=3,
stride=1,
padding=1,
)
self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
self.conv2 = nn.Conv2d(
in_channels=8,
out_channels=16,
kernel_size=3,
stride=1,
padding=1,
)
self.fc1 = nn.Linear(16 * 7 * 7, num_classes)
self.lr = lr
def training_step(self, batch, batch_idx):
x, y = batch
y_hat = self._common_step(x, batch_idx)
loss = criterion(y_hat, y)
accuracy = self.train_acc(y_hat, y)
self.log(
"train_acc_step",
self.train_acc,
on_step=True,
on_epoch=False,
prog_bar=True,
)
return loss
def training_epoch_end(self, outputs):
self.train_acc.reset()
def test_step(self, batch, batch_idx):
x, y = batch
y_hat = self._common_step(x, batch_idx)
loss = F.cross_entropy(y_hat, y)
accuracy = self.test_acc(y_hat, y)
self.log("test_loss", loss, on_step=True)
self.log("test_acc", accuracy, on_step=True)
def validation_step(self, batch, batch_idx):
x, y = batch
y_hat = self._common_step(x, batch_idx)
loss = F.cross_entropy(y_hat, y)
accuracy = self.test_acc(y_hat, y)
self.log("val_loss", loss, on_step=True)
self.log("val_acc", accuracy, on_step=True)
def predict_step(self, batch, batch_idx):
x, y = batch
y_hat = self._common_step(x)
return y_hat
def _common_step(self, x, batch_idx):
x = self.pool(F.relu(self.conv1(x)))
x = self.pool(F.relu(self.conv2(x)))
x = x.reshape(x.shape[0], -1)
y_hat = self.fc1(x)
return y_hat
def configure_optimizers(self):
optimizer = optim.Adam(self.parameters(), lr=self.lr)
return optimizer
class MNISTDataModule(pl.LightningDataModule):
def __init__(self, batch_size=512):
super().__init__()
self.batch_size = batch_size
def setup(self, stage):
mnist_full = train_dataset = datasets.MNIST(
root="dataset/", train=True, transform=transforms.ToTensor(), download=True
)
self.mnist_test = datasets.MNIST(
root="dataset/", train=False, transform=transforms.ToTensor(), download=True
)
self.mnist_train, self.mnist_val = torch.utils.data.random_split(
mnist_full, [55000, 5000]
)
def train_dataloader(self):
return DataLoader(
self.mnist_train,
batch_size=self.batch_size,
num_workers=6,
shuffle=True,
)
def val_dataloader(self):
return DataLoader(
self.mnist_val, batch_size=self.batch_size, num_workers=2, shuffle=False
)
def test_dataloader(self):
return DataLoader(
self.mnist_test, batch_size=self.batch_size, num_workers=2, shuffle=False
)
class MyPrintingCallback(Callback):
def on_train_start(self, trainer, pl_module):
print("Training is starting")
def on_train_end(self, trainer, pl_module):
print("Training is ending")
# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Load Data
if __name__ == "__main__":
# Initialize network
model_lightning = CNNLightning()
trainer = pl.Trainer(
#fast_dev_run=True,
# overfit_batches=3,
max_epochs=5,
precision=16,
accelerator="gpu",
devices=[0,1],
callbacks=[EarlyStopping(monitor="val_loss", mode="min")],
auto_lr_find=True,
enable_model_summary=True,
profiler="simple",
strategy="deepspeed_stage_1",
# accumulate_grad_batches=2,
# auto_scale_batch_size="binsearch",
# log_every_n_steps=1,
)
dm = MNISTDataModule()
# trainer tune first to find best batch size and lr
trainer.tune(model_lightning, dm)
trainer.fit(
model=model_lightning,
datamodule=dm,
)
# test model on test loader from LightningDataModule
trainer.test(model=model_lightning, datamodule=dm)

View File

@@ -1,15 +1,17 @@
""" """
Example code of a simple bidirectional LSTM on the MNIST dataset. Example code of a simple bidirectional LSTM on the MNIST dataset.
Note that using RNNs on image data is not the best idea, but it is a
good example to show how to use RNNs that still generalizes to other tasks.
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com> Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-05-09 Initial coding * 2020-05-09 Initial coding
* 2022-12-16 Updated with more detailed comments, docstrings to functions, and checked code still functions as intended.
""" """
# Imports # Imports
import torch import torch
import torchvision
import torch.nn as nn # All neural network modules, nn.Linear, nn.Conv2d, BatchNorm, Loss functions import torch.nn as nn # All neural network modules, nn.Linear, nn.Conv2d, BatchNorm, Loss functions
import torch.optim as optim # For all Optimization algorithms, SGD, Adam, etc. import torch.optim as optim # For all Optimization algorithms, SGD, Adam, etc.
import torch.nn.functional as F # All functions that don't have any parameters import torch.nn.functional as F # All functions that don't have any parameters
@@ -18,9 +20,10 @@ from torch.utils.data import (
) # Gives easier dataset managment and creates mini batches ) # Gives easier dataset managment and creates mini batches
import torchvision.datasets as datasets # Has standard datasets we can import in a nice way import torchvision.datasets as datasets # Has standard datasets we can import in a nice way
import torchvision.transforms as transforms # Transformations we can perform on our dataset import torchvision.transforms as transforms # Transformations we can perform on our dataset
from tqdm import tqdm # progress bar
# Set device # Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") device = "cuda" if torch.cuda.is_available() else "cpu"
# Hyperparameters # Hyperparameters
input_size = 28 input_size = 28
@@ -28,7 +31,7 @@ sequence_length = 28
num_layers = 2 num_layers = 2
hidden_size = 256 hidden_size = 256
num_classes = 10 num_classes = 10
learning_rate = 0.001 learning_rate = 3e-4
batch_size = 64 batch_size = 64
num_epochs = 2 num_epochs = 2
@@ -47,7 +50,7 @@ class BRNN(nn.Module):
h0 = torch.zeros(self.num_layers * 2, x.size(0), self.hidden_size).to(device) h0 = torch.zeros(self.num_layers * 2, x.size(0), self.hidden_size).to(device)
c0 = torch.zeros(self.num_layers * 2, x.size(0), self.hidden_size).to(device) c0 = torch.zeros(self.num_layers * 2, x.size(0), self.hidden_size).to(device)
out, _ = self.lstm(x, (h0, c0)) out, _ = self.lstm(x)
out = self.fc(out[:, -1, :]) out = self.fc(out[:, -1, :])
return out return out
@@ -74,7 +77,7 @@ optimizer = optim.Adam(model.parameters(), lr=learning_rate)
# Train Network # Train Network
for epoch in range(num_epochs): for epoch in range(num_epochs):
for batch_idx, (data, targets) in enumerate(train_loader): for batch_idx, (data, targets) in enumerate(tqdm(train_loader)):
# Get data to cuda if possible # Get data to cuda if possible
data = data.to(device=device).squeeze(1) data = data.to(device=device).squeeze(1)
targets = targets.to(device=device) targets = targets.to(device=device)
@@ -90,9 +93,8 @@ for epoch in range(num_epochs):
# gradient descent or adam step # gradient descent or adam step
optimizer.step() optimizer.step()
# Check accuracy on training & test to see how good our model # Check accuracy on training & test to see how good our model
def check_accuracy(loader, model): def check_accuracy(loader, model):
if loader.dataset.train: if loader.dataset.train:
print("Checking accuracy on training data") print("Checking accuracy on training data")

View File

@@ -1,12 +1,16 @@
""" """
Example code of how to initialize weights for a simple CNN network. Example code of how to initialize weights for a simple CNN network.
Usually this is not needed as default initialization is usually good,
but sometimes it can be useful to initialize weights in a specific way.
This way of doing it should generalize to other network types just make
sure to specify and change the modules you wish to modify.
Video explanation: https://youtu.be/xWQ-p_o0Uik Video explanation: https://youtu.be/xWQ-p_o0Uik
Got any questions leave a comment on youtube :) Got any questions leave a comment on youtube :)
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com> Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-04-10 Initial coding * 2020-04-10 Initial coding
* 2022-12-16 Updated with more detailed comments, and checked code still functions as intended.
""" """
# Imports # Imports
@@ -20,17 +24,17 @@ class CNN(nn.Module):
self.conv1 = nn.Conv2d( self.conv1 = nn.Conv2d(
in_channels=in_channels, in_channels=in_channels,
out_channels=6, out_channels=6,
kernel_size=(3, 3), kernel_size=3,
stride=(1, 1), stride=1,
padding=(1, 1), padding=1,
) )
self.pool = nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2)) self.pool = nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
self.conv2 = nn.Conv2d( self.conv2 = nn.Conv2d(
in_channels=6, in_channels=6,
out_channels=16, out_channels=16,
kernel_size=(3, 3), kernel_size=3,
stride=(1, 1), stride=1,
padding=(1, 1), padding=1,
) )
self.fc1 = nn.Linear(16 * 7 * 7, num_classes) self.fc1 = nn.Linear(16 * 7 * 7, num_classes)
self.initialize_weights() self.initialize_weights()
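
The diff above keeps the call to self.initialize_weights(); one common way to write such a method (a sketch, not necessarily identical to the file's version) is to loop over self.modules() and initialize each layer by type:

import torch
from torch import nn

class TinyCNN(nn.Module):
    def __init__(self, in_channels=1, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 8, kernel_size=3, stride=1, padding=1)
        self.bn = nn.BatchNorm2d(8)
        self.fc = nn.Linear(8 * 28 * 28, num_classes)
        self.initialize_weights()

    def initialize_weights(self):
        # Loop over all submodules and initialize depending on the layer type
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_uniform_(m.weight)
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.kaiming_uniform_(m.weight)
                nn.init.constant_(m.bias, 0)

    def forward(self, x):
        x = torch.relu(self.bn(self.conv(x)))
        return self.fc(x.reshape(x.shape[0], -1))

print(TinyCNN()(torch.randn(4, 1, 28, 28)).shape)  # torch.Size([4, 10])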

View File

@@ -9,7 +9,8 @@ Video explanation of code & how to save and load model: https://youtu.be/g6kQl_E
Got any questions leave a comment on youtube :) Got any questions leave a comment on youtube :)
Coded by Aladdin Persson <aladdin dot person at hotmail dot com> Coded by Aladdin Persson <aladdin dot person at hotmail dot com>
- 2020-04-07 Initial programming * 2020-04-07 Initial programming
* 2022-12-16 Updated with more detailed comments, and checked code still functions as intended.
""" """
@@ -39,7 +40,9 @@ def load_checkpoint(checkpoint, model, optimizer):
def main(): def main():
# Initialize network # Initialize network
model = torchvision.models.vgg16(pretrained=False) model = torchvision.models.vgg16(
weights=None
) # pretrained=False deprecated, use weights instead
optimizer = optim.Adam(model.parameters()) optimizer = optim.Adam(model.parameters())
checkpoint = {"state_dict": model.state_dict(), "optimizer": optimizer.state_dict()} checkpoint = {"state_dict": model.state_dict(), "optimizer": optimizer.state_dict()}

View File

@@ -3,13 +3,12 @@ Example code of how to use a learning rate scheduler simple, in this
case with a (very) small and simple Feedforward Network training on MNIST case with a (very) small and simple Feedforward Network training on MNIST
dataset with a learning rate scheduler. In this case ReduceLROnPlateau dataset with a learning rate scheduler. In this case ReduceLROnPlateau
scheduler is used, but can easily be changed to any of the other schedulers scheduler is used, but can easily be changed to any of the other schedulers
available. available. I think simply reducing LR by 1/10 or so, when loss plateaus is
a good default.
Video explanation: https://youtu.be/P31hB37g4Ak
Got any questions leave a comment on youtube :)
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com> Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-04-10 Initial programming * 2020-04-10 Initial programming
* 2022-12-19 Updated comments, made sure it works with latest PyTorch
""" """
@@ -28,7 +27,9 @@ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Hyperparameters # Hyperparameters
num_classes = 10 num_classes = 10
learning_rate = 0.1 learning_rate = (
0.1 # way too high learning rate, but we want to see the scheduler in action
)
batch_size = 128 batch_size = 128
num_epochs = 100 num_epochs = 100
@@ -47,7 +48,7 @@ optimizer = optim.Adam(model.parameters(), lr=learning_rate)
# Define Scheduler # Define Scheduler
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau( scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
optimizer, factor=0.1, patience=5, verbose=True optimizer, factor=0.1, patience=10, verbose=True
) )
# Train Network # Train Network
@@ -67,19 +68,19 @@ for epoch in range(1, num_epochs):
losses.append(loss.item()) losses.append(loss.item())
# backward # backward
loss.backward()
# gradient descent or adam step
# scheduler.step(loss)
optimizer.step()
optimizer.zero_grad() optimizer.zero_grad()
loss.backward()
optimizer.step()
mean_loss = sum(losses) / len(losses) mean_loss = sum(losses) / len(losses)
mean_loss = round(mean_loss, 2) # we should see difference in loss at 2 decimals
# After each epoch do scheduler.step, note in this scheduler we need to send # After each epoch do scheduler.step, note in this scheduler we need to send
# in loss for that epoch! # in loss for that epoch! This can also be set using validation loss, and also
# in the forward loop we can do on our batch but then we might need to modify
# the patience parameter
scheduler.step(mean_loss) scheduler.step(mean_loss)
print(f"Cost at epoch {epoch} is {mean_loss}") print(f"Average loss for epoch {epoch} was {mean_loss}")
# Check accuracy on training & test to see how good our model # Check accuracy on training & test to see how good our model
def check_accuracy(loader, model): def check_accuracy(loader, model):
@@ -90,6 +91,7 @@ def check_accuracy(loader, model):
with torch.no_grad(): with torch.no_grad():
for x, y in loader: for x, y in loader:
x = x.to(device=device) x = x.to(device=device)
x = x.reshape(x.shape[0], -1)
y = y.to(device=device) y = y.to(device=device)
scores = model(x) scores = model(x)
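
As a standalone sketch of the scheduler pattern discussed above: ReduceLROnPlateau is stepped with the monitored metric rather than with no arguments, and it cuts the learning rate by `factor` once the metric has stalled for `patience` epochs. The constant loss below is a stand-in just to trigger the reduction:

from torch import nn, optim

model = nn.Linear(10, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Cut the learning rate by 10x when the metric stops improving for `patience` epochs
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1, patience=10)

for epoch in range(40):
    mean_loss = 0.5            # stand-in for the epoch's average training loss (never improves)
    optimizer.step()           # on real data this follows loss.backward()
    scheduler.step(mean_loss)  # ReduceLROnPlateau must be given the metric
    print(f"epoch {epoch}: lr = {optimizer.param_groups[0]['lr']:.0e}")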

View File

@@ -1,9 +1,23 @@
"""
Example code of how to use mixed precision training with PyTorch. In this
case with a (very) small and simple CNN training on MNIST dataset. This
example is based on the official PyTorch documentation on mixed precision
training.
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-04-10 Initial programming
* 2022-12-19 Updated comments, made sure it works with latest PyTorch
"""
# Imports # Imports
import torch import torch
import torch.nn as nn # All neural network modules, nn.Linear, nn.Conv2d, BatchNorm, Loss functions import torch.nn as nn # All neural network modules, nn.Linear, nn.Conv2d, BatchNorm, Loss functions
import torch.optim as optim # For all Optimization algorithms, SGD, Adam, etc. import torch.optim as optim # For all Optimization algorithms, SGD, Adam, etc.
import torch.nn.functional as F # All functions that don't have any parameters import torch.nn.functional as F # All functions that don't have any parameters
from torch.utils.data import DataLoader # Gives easier dataset managment and creates mini batches from torch.utils.data import (
DataLoader,
) # Gives easier dataset managment and creates mini batches
import torchvision.datasets as datasets # Has standard datasets we can import in a nice way import torchvision.datasets as datasets # Has standard datasets we can import in a nice way
import torchvision.transforms as transforms # Transformations we can perform on our dataset import torchvision.transforms as transforms # Transformations we can perform on our dataset
@@ -12,9 +26,21 @@ import torchvision.transforms as transforms # Transformations we can perform on
class CNN(nn.Module): class CNN(nn.Module):
def __init__(self, in_channels=1, num_classes=10): def __init__(self, in_channels=1, num_classes=10):
super(CNN, self).__init__() super(CNN, self).__init__()
self.conv1 = nn.Conv2d(in_channels=1, out_channels=420, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) self.conv1 = nn.Conv2d(
in_channels=1,
out_channels=420,
kernel_size=(3, 3),
stride=(1, 1),
padding=(1, 1),
)
self.pool = nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2)) self.pool = nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
self.conv2 = nn.Conv2d(in_channels=420, out_channels=1000, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) self.conv2 = nn.Conv2d(
in_channels=420,
out_channels=1000,
kernel_size=(3, 3),
stride=(1, 1),
padding=(1, 1),
)
self.fc1 = nn.Linear(1000 * 7 * 7, num_classes) self.fc1 = nn.Linear(1000 * 7 * 7, num_classes)
def forward(self, x): def forward(self, x):
@@ -29,19 +55,24 @@ class CNN(nn.Module):
# Set device # Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
assert device == "cuda", "GPU not available"
# Hyperparameters # Hyperparameters
in_channel = 1 in_channel = 1
num_classes = 10 num_classes = 10
learning_rate = 0.001 learning_rate = 3e-4
batch_size = 100 batch_size = 100
num_epochs = 5 num_epochs = 5
# Load Data # Load Data
train_dataset = datasets.MNIST(root='dataset/', train=True, transform=transforms.ToTensor(), download=True) train_dataset = datasets.MNIST(
root="dataset/", train=True, transform=transforms.ToTensor(), download=True
)
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True) train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_dataset = datasets.MNIST(root='dataset/', train=False, transform=transforms.ToTensor(), download=True) test_dataset = datasets.MNIST(
root="dataset/", train=False, transform=transforms.ToTensor(), download=True
)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True) test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True)
# Initialize network # Initialize network
@@ -74,7 +105,6 @@ for epoch in range(num_epochs):
# Check accuracy on training & test to see how good our model # Check accuracy on training & test to see how good our model
def check_accuracy(loader, model): def check_accuracy(loader, model):
num_correct = 0 num_correct = 0
num_samples = 0 num_samples = 0
@@ -90,7 +120,9 @@ def check_accuracy(loader, model):
num_correct += (predictions == y).sum() num_correct += (predictions == y).sum()
num_samples += predictions.size(0) num_samples += predictions.size(0)
print(f'Got {num_correct} / {num_samples} with accuracy {float(num_correct) / float(num_samples) * 100:.2f}') print(
f"Got {num_correct} / {num_samples} with accuracy {float(num_correct) / float(num_samples) * 100:.2f}"
)
model.train() model.train()
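
For reference alongside the mixed-precision file above, a single training step with torch.cuda.amp (a sketch assuming a CUDA device is available) follows the usual autocast + GradScaler pattern:

import torch
from torch import nn, optim

device = "cuda"
model = nn.Linear(784, 10).to(device)
optimizer = optim.Adam(model.parameters(), lr=3e-4)
criterion = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()

data = torch.randn(64, 784, device=device)
targets = torch.randint(0, 10, (64,), device=device)

optimizer.zero_grad()
with torch.cuda.amp.autocast():          # forward pass runs in float16 where safe
    loss = criterion(model(data), targets)
scaler.scale(loss).backward()            # scale the loss to avoid gradient underflow
scaler.step(optimizer)                   # unscales gradients, then optimizer.step()
scaler.update()                          # adjust the scale factor for the next step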

View File

@@ -3,11 +3,9 @@ Shows a small example of how to load a pretrain model (VGG16) from PyTorch,
and modifies this to train on the CIFAR10 dataset. The same method generalizes and modifies this to train on the CIFAR10 dataset. The same method generalizes
well to other datasets, but the modifications to the network may need to be changed. well to other datasets, but the modifications to the network may need to be changed.
Video explanation: https://youtu.be/U4bHxEhMGNk
Got any questions leave a comment on youtube :)
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com> Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-04-08 Initial coding * 2020-04-08 Initial coding
* 2022-12-19 Updated comments, minor code changes, made sure it works with latest PyTorch
""" """
@@ -22,8 +20,8 @@ from torch.utils.data import (
) # Gives easier dataset managment and creates mini batches ) # Gives easier dataset managment and creates mini batches
import torchvision.datasets as datasets # Has standard datasets we can import in a nice way import torchvision.datasets as datasets # Has standard datasets we can import in a nice way
import torchvision.transforms as transforms # Transformations we can perform on our dataset import torchvision.transforms as transforms # Transformations we can perform on our dataset
from tqdm import tqdm
# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Hyperparameters # Hyperparameters
@@ -32,17 +30,8 @@ learning_rate = 1e-3
batch_size = 1024 batch_size = 1024
num_epochs = 5 num_epochs = 5
# Simple Identity class that let's input pass without changes
class Identity(nn.Module):
def __init__(self):
super(Identity, self).__init__()
def forward(self, x):
return x
# Load pretrain model & modify it # Load pretrain model & modify it
model = torchvision.models.vgg16(pretrained=True) model = torchvision.models.vgg16(weights="DEFAULT")
# If you want to do finetuning then set requires_grad = False # If you want to do finetuning then set requires_grad = False
# Remove these two lines if you want to train entire model, # Remove these two lines if you want to train entire model,
@@ -50,7 +39,7 @@ model = torchvision.models.vgg16(pretrained=True)
for param in model.parameters(): for param in model.parameters():
param.requires_grad = False param.requires_grad = False
model.avgpool = Identity() model.avgpool = nn.Identity()
model.classifier = nn.Sequential( model.classifier = nn.Sequential(
nn.Linear(512, 100), nn.ReLU(), nn.Linear(100, num_classes) nn.Linear(512, 100), nn.ReLU(), nn.Linear(100, num_classes)
) )
@@ -71,7 +60,7 @@ optimizer = optim.Adam(model.parameters(), lr=learning_rate)
for epoch in range(num_epochs): for epoch in range(num_epochs):
losses = [] losses = []
for batch_idx, (data, targets) in enumerate(train_loader): for batch_idx, (data, targets) in enumerate(tqdm(train_loader)):
# Get data to cuda if possible # Get data to cuda if possible
data = data.to(device=device) data = data.to(device=device)
targets = targets.to(device=device) targets = targets.to(device=device)
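
A small end-to-end sketch of the modification shown above (torchvision >= 0.13 assumed): freeze VGG16's features, swap the average pool for nn.Identity, and attach a small classifier sized for CIFAR10's 32x32 inputs:

import torch
from torch import nn
import torchvision

num_classes = 10
model = torchvision.models.vgg16(weights="DEFAULT")

# Freeze the convolutional backbone
for param in model.parameters():
    param.requires_grad = False

# Drop the 7x7 average pool (CIFAR10 inputs are small) and attach a new classifier
model.avgpool = nn.Identity()
model.classifier = nn.Sequential(
    nn.Linear(512, 100), nn.ReLU(), nn.Linear(100, num_classes)
)

print(model(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 10])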

View File

@@ -1,11 +1,18 @@
"""
Example code of how to set progress bar using tqdm that is very efficient and nicely looking.
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-05-09 Initial coding
* 2022-12-19 Updated with more detailed comments, and checked code works with latest PyTorch.
"""
import torch import torch
import torch.nn as nn import torch.nn as nn
from tqdm import tqdm from tqdm import tqdm
from torch.utils.data import TensorDataset, DataLoader from torch.utils.data import TensorDataset, DataLoader
# Create a simple toy dataset example, normally this # Create a simple toy dataset
# would be doing custom class with __getitem__ etc,
# which we have done in custom dataset tutorials
x = torch.randn((1000, 3, 224, 224)) x = torch.randn((1000, 3, 224, 224))
y = torch.randint(low=0, high=10, size=(1000, 1)) y = torch.randint(low=0, high=10, size=(1000, 1))
ds = TensorDataset(x, y) ds = TensorDataset(x, y)
@@ -13,12 +20,12 @@ loader = DataLoader(ds, batch_size=8)
model = nn.Sequential( model = nn.Sequential(
nn.Conv2d(3, 10, kernel_size=3, padding=1, stride=1), nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3, padding=1, stride=1),
nn.Flatten(), nn.Flatten(),
nn.Linear(10*224*224, 10), nn.Linear(10 * 224 * 224, 10),
) )
NUM_EPOCHS = 100 NUM_EPOCHS = 10
for epoch in range(NUM_EPOCHS): for epoch in range(NUM_EPOCHS):
loop = tqdm(loader) loop = tqdm(loader)
for idx, (x, y) in enumerate(loop): for idx, (x, y) in enumerate(loop):
@@ -35,7 +42,3 @@ for epoch in range(NUM_EPOCHS):
loop.set_postfix(loss=torch.rand(1).item(), acc=torch.rand(1).item()) loop.set_postfix(loss=torch.rand(1).item(), acc=torch.rand(1).item())
# There you go. Hope it was useful :) # There you go. Hope it was useful :)

View File

@@ -3,23 +3,24 @@ Example code of a simple RNN, GRU, LSTM on the MNIST dataset.
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com> Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-05-09 Initial coding * 2020-05-09 Initial coding
* 2022-12-16 Updated with more detailed comments, docstrings to functions, and checked code still functions as intended.
""" """
# Imports # Imports
import torch import torch
import torchvision import torch.nn.functional as F # Parameterless functions, like (some) activation functions
import torch.nn as nn # All neural network modules, nn.Linear, nn.Conv2d, BatchNorm, Loss functions import torchvision.datasets as datasets # Standard datasets
import torch.optim as optim # For all Optimization algorithms, SGD, Adam, etc. import torchvision.transforms as transforms # Transformations we can perform on our dataset for augmentation
import torch.nn.functional as F # All functions that don't have any parameters from torch import optim # For optimizers like SGD, Adam, etc.
from torch import nn # All neural network modules
from torch.utils.data import ( from torch.utils.data import (
DataLoader, DataLoader,
) # Gives easier dataset managment and creates mini batches ) # Gives easier dataset managment by creating mini batches etc.
import torchvision.datasets as datasets # Has standard datasets we can import in a nice way from tqdm import tqdm # For a nice progress bar!
import torchvision.transforms as transforms # Transformations we can perform on our dataset
# Set device # Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") device = "cuda" if torch.cuda.is_available() else "cpu"
# Hyperparameters # Hyperparameters
input_size = 28 input_size = 28
@@ -29,7 +30,7 @@ num_classes = 10
sequence_length = 28 sequence_length = 28
learning_rate = 0.005 learning_rate = 0.005
batch_size = 64 batch_size = 64
num_epochs = 2 num_epochs = 3
# Recurrent neural network (many-to-one) # Recurrent neural network (many-to-one)
class RNN(nn.Module): class RNN(nn.Module):
@@ -104,15 +105,13 @@ class RNN_LSTM(nn.Module):
train_dataset = datasets.MNIST( train_dataset = datasets.MNIST(
root="dataset/", train=True, transform=transforms.ToTensor(), download=True root="dataset/", train=True, transform=transforms.ToTensor(), download=True
) )
test_dataset = datasets.MNIST( test_dataset = datasets.MNIST(
root="dataset/", train=False, transform=transforms.ToTensor(), download=True root="dataset/", train=False, transform=transforms.ToTensor(), download=True
) )
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True) train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True) test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True)
# Initialize network # Initialize network (try out just using simple RNN, or GRU, and then compare with LSTM)
model = RNN_LSTM(input_size, hidden_size, num_layers, num_classes).to(device) model = RNN_LSTM(input_size, hidden_size, num_layers, num_classes).to(device)
# Loss and optimizer # Loss and optimizer
@@ -121,7 +120,7 @@ optimizer = optim.Adam(model.parameters(), lr=learning_rate)
# Train Network # Train Network
for epoch in range(num_epochs): for epoch in range(num_epochs):
for batch_idx, (data, targets) in enumerate(train_loader): for batch_idx, (data, targets) in enumerate(tqdm(train_loader)):
# Get data to cuda if possible # Get data to cuda if possible
data = data.to(device=device).squeeze(1) data = data.to(device=device).squeeze(1)
targets = targets.to(device=device) targets = targets.to(device=device)
@@ -134,16 +133,11 @@ for epoch in range(num_epochs):
optimizer.zero_grad() optimizer.zero_grad()
loss.backward() loss.backward()
# gradient descent or adam step # gradient descent update step/adam step
optimizer.step() optimizer.step()
# Check accuracy on training & test to see how good our model # Check accuracy on training & test to see how good our model
def check_accuracy(loader, model): def check_accuracy(loader, model):
if loader.dataset.train:
print("Checking accuracy on training data")
else:
print("Checking accuracy on test data")
num_correct = 0 num_correct = 0
num_samples = 0 num_samples = 0
@@ -160,13 +154,10 @@ def check_accuracy(loader, model):
num_correct += (predictions == y).sum() num_correct += (predictions == y).sum()
num_samples += predictions.size(0) num_samples += predictions.size(0)
print( # Toggle model back to train
f"Got {num_correct} / {num_samples} with \
accuracy {float(num_correct)/float(num_samples)*100:.2f}"
)
# Set model back to train
model.train() model.train()
return num_correct / num_samples
check_accuracy(train_loader, model) print(f"Accuracy on training set: {check_accuracy(train_loader, model)*100:.2f}")
check_accuracy(test_loader, model) print(f"Accuracy on test set: {check_accuracy(test_loader, model)*100:.2f}")
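
The updated file above has check_accuracy return num_correct / num_samples instead of printing inside the function; a self-contained sketch of that pattern (the model and loader below are random stand-ins, purely to exercise the function):

import torch
from torch import nn

def check_accuracy(loader, model, device="cpu"):
    num_correct = 0
    num_samples = 0
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            _, predictions = model(x).max(1)
            num_correct += (predictions == y).sum()
            num_samples += predictions.size(0)
    model.train()
    return float(num_correct) / float(num_samples)

# Stand-in model and "loader"
model = nn.Linear(10, 3)
loader = [(torch.randn(8, 10), torch.randint(0, 3, (8,))) for _ in range(4)]
print(f"Accuracy: {check_accuracy(loader, model) * 100:.2f}")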

View File

@@ -1,46 +1,47 @@
""" """
Example code of a simple CNN network training on MNIST dataset. A simple walkthrough of how to code a convolutional neural network (CNN)
The code is intended to show how to create a CNN network as well using the PyTorch library. For demonstration we train it on the very
as how to initialize loss, optimizer, etc. in a simple way to get common MNIST dataset of handwritten digits. In this code we go through
training to work with function that checks accuracy as well. how to create the network as well as initialize a loss function, optimizer,
check accuracy and more.
Video explanation: https://youtu.be/wnK3uWv_WkU Programmed by Aladdin Persson
Got any questions leave a comment on youtube :) * 2020-04-08: Initial coding
* 2021-03-24: More detailed comments and small revision of the code
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com> * 2022-12-19: Small revision of code, checked that it works with latest PyTorch version
* 2020-04-08 Initial coding
""" """
# Imports # Imports
import torch import torch
import torch.nn as nn # All neural network modules, nn.Linear, nn.Conv2d, BatchNorm, Loss functions import torch.nn.functional as F # Parameterless functions, like (some) activation functions
import torch.optim as optim # For all Optimization algorithms, SGD, Adam, etc. import torchvision.datasets as datasets # Standard datasets
import torch.nn.functional as F # All functions that don't have any parameters import torchvision.transforms as transforms # Transformations we can perform on our dataset for augmentation
from torch import optim # For optimizers like SGD, Adam, etc.
from torch import nn # All neural network modules
from torch.utils.data import ( from torch.utils.data import (
DataLoader, DataLoader,
) # Gives easier dataset managment and creates mini batches ) # Gives easier dataset managment by creating mini batches etc.
import torchvision.datasets as datasets # Has standard datasets we can import in a nice way from tqdm import tqdm # For nice progress bar!
import torchvision.transforms as transforms # Transformations we can perform on our dataset
# Simple CNN # Simple CNN
class CNN(nn.Module): class CNN(nn.Module):
def __init__(self, in_channels=1, num_classes=10): def __init__(self, in_channels=1, num_classes=10):
super(CNN, self).__init__() super(CNN, self).__init__()
self.conv1 = nn.Conv2d( self.conv1 = nn.Conv2d(
in_channels=1, in_channels=in_channels,
out_channels=8, out_channels=8,
kernel_size=(3, 3), kernel_size=3,
stride=(1, 1), stride=1,
padding=(1, 1), padding=1,
) )
self.pool = nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2)) self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
self.conv2 = nn.Conv2d( self.conv2 = nn.Conv2d(
in_channels=8, in_channels=8,
out_channels=16, out_channels=16,
kernel_size=(3, 3), kernel_size=3,
stride=(1, 1), stride=1,
padding=(1, 1), padding=1,
) )
self.fc1 = nn.Linear(16 * 7 * 7, num_classes) self.fc1 = nn.Linear(16 * 7 * 7, num_classes)
@@ -51,7 +52,6 @@ class CNN(nn.Module):
x = self.pool(x) x = self.pool(x)
x = x.reshape(x.shape[0], -1) x = x.reshape(x.shape[0], -1)
x = self.fc1(x) x = self.fc1(x)
return x return x
@@ -59,24 +59,24 @@ class CNN(nn.Module):
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Hyperparameters # Hyperparameters
in_channel = 1 in_channels = 1
num_classes = 10 num_classes = 10
learning_rate = 0.001 learning_rate = 3e-4 # karpathy's constant
batch_size = 64 batch_size = 64
num_epochs = 5 num_epochs = 3
# Load Data # Load Data
train_dataset = datasets.MNIST( train_dataset = datasets.MNIST(
root="dataset/", train=True, transform=transforms.ToTensor(), download=True root="dataset/", train=True, transform=transforms.ToTensor(), download=True
) )
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_dataset = datasets.MNIST( test_dataset = datasets.MNIST(
root="dataset/", train=False, transform=transforms.ToTensor(), download=True root="dataset/", train=False, transform=transforms.ToTensor(), download=True
) )
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True) test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True)
# Initialize network # Initialize network
model = CNN().to(device) model = CNN(in_channels=in_channels, num_classes=num_classes).to(device)
# Loss and optimizer # Loss and optimizer
criterion = nn.CrossEntropyLoss() criterion = nn.CrossEntropyLoss()
@@ -84,7 +84,7 @@ optimizer = optim.Adam(model.parameters(), lr=learning_rate)
# Train Network # Train Network
for epoch in range(num_epochs): for epoch in range(num_epochs):
for batch_idx, (data, targets) in enumerate(train_loader): for batch_idx, (data, targets) in enumerate(tqdm(train_loader)):
# Get data to cuda if possible # Get data to cuda if possible
data = data.to(device=device) data = data.to(device=device)
targets = targets.to(device=device) targets = targets.to(device=device)
@@ -101,14 +101,7 @@ for epoch in range(num_epochs):
optimizer.step() optimizer.step()
# Check accuracy on training & test to see how good our model # Check accuracy on training & test to see how good our model
def check_accuracy(loader, model): def check_accuracy(loader, model):
if loader.dataset.train:
print("Checking accuracy on training data")
else:
print("Checking accuracy on test data")
num_correct = 0 num_correct = 0
num_samples = 0 num_samples = 0
model.eval() model.eval()
@@ -123,12 +116,9 @@ def check_accuracy(loader, model):
num_correct += (predictions == y).sum() num_correct += (predictions == y).sum()
num_samples += predictions.size(0) num_samples += predictions.size(0)
print(
f"Got {num_correct} / {num_samples} with accuracy {float(num_correct)/float(num_samples)*100:.2f}"
)
model.train() model.train()
return num_correct / num_samples
check_accuracy(train_loader, model) print(f"Accuracy on training set: {check_accuracy(train_loader, model)*100:.2f}")
check_accuracy(test_loader, model) print(f"Accuracy on test set: {check_accuracy(test_loader, model)*100:.2f}")
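
One detail the CNN walkthrough above glosses over is why fc1 takes 16 * 7 * 7 inputs; tracing the shapes for a 28x28 MNIST image makes it explicit:

import torch
from torch import nn

x = torch.randn(1, 1, 28, 28)
conv1 = nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=1)   # keeps the spatial size
pool = nn.MaxPool2d(kernel_size=2, stride=2)                  # halves H and W
conv2 = nn.Conv2d(8, 16, kernel_size=3, stride=1, padding=1)  # keeps the spatial size

x = pool(conv1(x))   # -> (1, 8, 14, 14)
x = pool(conv2(x))   # -> (1, 16, 7, 7)
print(x.shape)
print(x.reshape(x.shape[0], -1).shape)  # -> (1, 784), i.e. 16 * 7 * 7 features into fc1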

View File

@@ -1,43 +1,70 @@
""" """
Working code of a simple Fully Connected (FC) network training on MNIST dataset. A simple walkthrough of how to code a fully connected neural network
The code is intended to show how to create a FC network as well using the PyTorch library. For demonstration we train it on the very
as how to initialize loss, optimizer, etc. in a simple way to get common MNIST dataset of handwritten digits. In this code we go through
training to work with function that checks accuracy as well. how to create the network as well as initialize a loss function, optimizer,
check accuracy and more.
Video explanation: https://youtu.be/Jy4wM2X21u0
Got any questions leave a comment on youtube :)
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-04-08 Initial coding
Programmed by Aladdin Persson
* 2020-04-08: Initial coding
* 2021-03-24: Added more detailed comments also removed part of
check_accuracy which would only work specifically on MNIST.
* 2022-09-23: Updated with more detailed comments, docstrings to functions, and checked code still functions as intended.
""" """
# Imports # Imports
import torch import torch
import torchvision import torch.nn.functional as F # Parameterless functions, like (some) activation functions
import torch.nn as nn # All neural network modules, nn.Linear, nn.Conv2d, BatchNorm, Loss functions import torchvision.datasets as datasets # Standard datasets
import torch.optim as optim # For all Optimization algorithms, SGD, Adam, etc. import torchvision.transforms as transforms # Transformations we can perform on our dataset for augmentation
import torch.nn.functional as F # All functions that don't have any parameters from torch import optim # For optimizers like SGD, Adam, etc.
from torch import nn # All neural network modules
from torch.utils.data import ( from torch.utils.data import (
DataLoader, DataLoader,
) # Gives easier dataset managment and creates mini batches ) # Gives easier dataset managment by creating mini batches etc.
import torchvision.datasets as datasets # Has standard datasets we can import in a nice way from tqdm import tqdm # For nice progress bar!
import torchvision.transforms as transforms # Transformations we can perform on our dataset
# Create Fully Connected Network # Here we create our simple neural network. For more details here we are subclassing and
# inheriting from nn.Module, this is the most general way to create your networks and
# allows for more flexibility. I encourage you to also check out nn.Sequential which
# would be easier to use in this scenario but I wanted to show you something that
# "always" works and is a general approach.
class NN(nn.Module): class NN(nn.Module):
def __init__(self, input_size, num_classes): def __init__(self, input_size, num_classes):
"""
Here we define the layers of the network. We create two fully connected layers
Parameters:
input_size: the size of the input, in this case 784 (28x28)
num_classes: the number of classes we want to predict, in this case 10 (0-9)
"""
super(NN, self).__init__() super(NN, self).__init__()
# Our first linear layer take input_size, in this case 784 nodes to 50
# and our second linear layer takes 50 to the num_classes we have, in
# this case 10.
self.fc1 = nn.Linear(input_size, 50) self.fc1 = nn.Linear(input_size, 50)
self.fc2 = nn.Linear(50, num_classes) self.fc2 = nn.Linear(50, num_classes)
def forward(self, x): def forward(self, x):
"""
x here is the mnist images and we run it through fc1, fc2 that we created above.
we also add a ReLU activation function in between and for that (since it has no parameters)
I recommend using nn.functional (F)
Parameters:
x: mnist images
Returns:
out: the output of the network
"""
x = F.relu(self.fc1(x)) x = F.relu(self.fc1(x))
x = self.fc2(x) x = self.fc2(x)
return x return x
# Set device # Set device cuda for GPU if it's available otherwise run on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Hyperparameters # Hyperparameters
@@ -45,16 +72,16 @@ input_size = 784
num_classes = 10 num_classes = 10
learning_rate = 0.001 learning_rate = 0.001
batch_size = 64 batch_size = 64
num_epochs = 1 num_epochs = 3
# Load Data # Load Data
train_dataset = datasets.MNIST( train_dataset = datasets.MNIST(
root="dataset/", train=True, transform=transforms.ToTensor(), download=True root="dataset/", train=True, transform=transforms.ToTensor(), download=True
) )
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_dataset = datasets.MNIST( test_dataset = datasets.MNIST(
root="dataset/", train=False, transform=transforms.ToTensor(), download=True root="dataset/", train=False, transform=transforms.ToTensor(), download=True
) )
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True) test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True)
# Initialize network # Initialize network
@@ -66,7 +93,7 @@ optimizer = optim.Adam(model.parameters(), lr=learning_rate)
# Train Network # Train Network
for epoch in range(num_epochs): for epoch in range(num_epochs):
for batch_idx, (data, targets) in enumerate(train_loader): for batch_idx, (data, targets) in enumerate(tqdm(train_loader)):
# Get data to cuda if possible # Get data to cuda if possible
data = data.to(device=device) data = data.to(device=device)
targets = targets.to(device=device) targets = targets.to(device=device)
@@ -74,47 +101,64 @@ for epoch in range(num_epochs):
# Get to correct shape # Get to correct shape
data = data.reshape(data.shape[0], -1) data = data.reshape(data.shape[0], -1)
# forward # Forward
scores = model(data) scores = model(data)
loss = criterion(scores, targets) loss = criterion(scores, targets)
# backward # Backward
optimizer.zero_grad() optimizer.zero_grad()
loss.backward() loss.backward()
# gradient descent or adam step # Gradient descent or adam step
optimizer.step() optimizer.step()
# Check accuracy on training & test to see how good our model # Check accuracy on training & test to see how good our model
def check_accuracy(loader, model): def check_accuracy(loader, model):
if loader.dataset.train: """
print("Checking accuracy on training data") Check accuracy of our trained model given a loader and a model
else:
print("Checking accuracy on test data") Parameters:
loader: torch.utils.data.DataLoader
A loader for the dataset you want to check accuracy on
model: nn.Module
The model you want to check accuracy on
Returns:
acc: float
The accuracy of the model on the dataset given by the loader
"""
num_correct = 0 num_correct = 0
num_samples = 0 num_samples = 0
model.eval() model.eval()
# We don't need to keep track of gradients here so we wrap it in torch.no_grad()
with torch.no_grad(): with torch.no_grad():
# Loop through the data
for x, y in loader: for x, y in loader:
# Move data to device
x = x.to(device=device) x = x.to(device=device)
y = y.to(device=device) y = y.to(device=device)
# Get to correct shape
x = x.reshape(x.shape[0], -1) x = x.reshape(x.shape[0], -1)
# Forward pass
scores = model(x) scores = model(x)
_, predictions = scores.max(1) _, predictions = scores.max(1)
# Check how many we got correct
num_correct += (predictions == y).sum() num_correct += (predictions == y).sum()
# Keep track of number of samples
num_samples += predictions.size(0) num_samples += predictions.size(0)
print(
f"Got {num_correct} / {num_samples} with accuracy {float(num_correct)/float(num_samples)*100:.2f}"
)
model.train() model.train()
return num_correct / num_samples
check_accuracy(train_loader, model) # Check accuracy on training & test to see how good our model
check_accuracy(test_loader, model) print(f"Accuracy on training set: {check_accuracy(train_loader, model)*100:.2f}")
print(f"Accuracy on test set: {check_accuracy(test_loader, model)*100:.2f}")

View File

@@ -1,3 +1,13 @@
"""
Code for calculating the mean and standard deviation of a dataset.
This is useful for normalizing the dataset to obtain mean 0, std 1.
Programmed by Aladdin Persson <aladdin.persson at hotmail dot com>
* 2020-05-09 Initial coding
* 2022-12-16 Updated comments, code revision, and checked code still works with latest PyTorch.
"""
import torch import torch
import torchvision.transforms as transforms import torchvision.transforms as transforms
from torch.utils.data import DataLoader from torch.utils.data import DataLoader
@@ -5,20 +15,23 @@ import torchvision.datasets as datasets
from tqdm import tqdm from tqdm import tqdm
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
train_set = datasets.CIFAR10(root="ds/", transform=transforms.ToTensor(), download=True) train_set = datasets.CIFAR10(
root="dataset/", transform=transforms.ToTensor(), download=True
)
train_loader = DataLoader(dataset=train_set, batch_size=64, shuffle=True) train_loader = DataLoader(dataset=train_set, batch_size=64, shuffle=True)
def get_mean_std(loader): def get_mean_std(loader):
# var[X] = E[X**2] - E[X]**2 # var[X] = E[X**2] - E[X]**2
channels_sum, channels_sqrd_sum, num_batches = 0, 0, 0 channels_sum, channels_sqrd_sum, num_batches = 0, 0, 0
for data, _ in tqdm(loader): for data, _ in tqdm(loader):
channels_sum += torch.mean(data, dim=[0, 2, 3]) channels_sum += torch.mean(data, dim=[0, 2, 3])
channels_sqrd_sum += torch.mean(data ** 2, dim=[0, 2, 3]) channels_sqrd_sum += torch.mean(data**2, dim=[0, 2, 3])
num_batches += 1 num_batches += 1
mean = channels_sum / num_batches mean = channels_sum / num_batches
std = (channels_sqrd_sum / num_batches - mean ** 2) ** 0.5 std = (channels_sqrd_sum / num_batches - mean**2) ** 0.5
return mean, std return mean, std
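
Once get_mean_std has been run, the statistics typically feed a Normalize transform. The CIFAR10 numbers below are the commonly cited values and are only illustrative; in practice they would come from the function above:

import torchvision.transforms as transforms

mean = [0.4914, 0.4822, 0.4465]  # illustrative CIFAR10 channel means
std = [0.2470, 0.2435, 0.2616]   # illustrative CIFAR10 channel stds

transform = transforms.Compose(
    [
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std),
    ]
)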

View File

@@ -11,9 +11,13 @@ But also other things such as setting the device (GPU/CPU) and converting
between different types (int, float etc) and how to convert a tensor to an between different types (int, float etc) and how to convert a tensor to an
numpy array and vice-versa. numpy array and vice-versa.
Programmed by Aladdin Persson
* 2020-06-27: Initial coding
* 2022-12-19: Small revision of code, checked that it works with latest PyTorch version
""" """
import torch import torch
import numpy as np
# ================================================================= # # ================================================================= #
# Initializing Tensor # # Initializing Tensor #
@@ -74,8 +78,6 @@ print(
print(f"Converted float64 {tensor.double()}") # Converted to float64 print(f"Converted float64 {tensor.double()}") # Converted to float64
# Array to Tensor conversion and vice-versa # Array to Tensor conversion and vice-versa
import numpy as np
np_array = np.zeros((5, 5)) np_array = np.zeros((5, 5))
tensor = torch.from_numpy(np_array) tensor = torch.from_numpy(np_array)
np_array_again = ( np_array_again = (
@@ -109,7 +111,7 @@ t += x # Also inplace: t = t + x is not inplace, bit confusing.
# -- Exponentiation (Element wise if vector or matrices) -- # -- Exponentiation (Element wise if vector or matrices) --
z = x.pow(2) # z = [1, 4, 9] z = x.pow(2) # z = [1, 4, 9]
z = x ** 2 # z = [1, 4, 9] z = x**2 # z = [1, 4, 9]
# -- Simple Comparison -- # -- Simple Comparison --
@@ -150,7 +152,7 @@ z = (
x1 - x2 x1 - x2
) # Shape of z is 5x5: How? The 1x5 vector (x2) is subtracted for each row in the 5x5 (x1) ) # Shape of z is 5x5: How? The 1x5 vector (x2) is subtracted for each row in the 5x5 (x1)
z = ( z = (
x1 ** x2 x1**x2
) # Shape of z is 5x5: How? Broadcasting! Element wise exponentiation for every row ) # Shape of z is 5x5: How? Broadcasting! Element wise exponentiation for every row
# Other useful tensor operations # Other useful tensor operations

Some files were not shown because too many files have changed in this diff