CelebA Dataset


Visualization of the Celeb-A dataset in the Deep Lake UI

Celeb-A dataset

What is Celeb-A Dataset?

The CelebFaces Attributes Dataset (CelebA) consists of more than 200K celebrity images, each with 40 attribute annotations. The images cover large pose variations, background clutter, and diverse people, which makes this dataset well suited for training and testing models for face detection and facial attribute recognition, for example identifying people who have brown hair, are smiling, or are wearing glasses.

Download Celeb-A Dataset in Python

Instead of downloading the CelebA dataset, you can load it in Python with just one line of code using the open-source Deep Lake package.

Load CelebA Dataset Training Subset in Python

				
import deeplake
ds = deeplake.load("hub://activeloop/celeb-a-train")

Load CelebA Dataset Validation Subset in Python

				
import deeplake
ds = deeplake.load("hub://activeloop/celeb-a-val")

Load CelebA Dataset Testing Subset in Python

				
import deeplake
ds = deeplake.load("hub://activeloop/celeb-a-test")
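The three snippets above differ only in the split suffix of the dataset path. As a minimal sketch, the path can be built from the split name. `celeba_path` and `load_celeba` are hypothetical helper names, not part of Deep Lake, and the sketch assumes a Deep Lake release that exposes `deeplake.load` as used above.

```python
# Split suffixes taken from the three dataset paths shown above.
SPLITS = ("train", "val", "test")


def celeba_path(split: str) -> str:
    """Return the hub path for a CelebA split (hypothetical helper)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    return f"hub://activeloop/celeb-a-{split}"


def load_celeba(split: str):
    """Load one split with Deep Lake (hypothetical helper).

    The import is deferred so that building paths does not require
    the deeplake package to be installed.
    """
    import deeplake

    return deeplake.load(celeba_path(split))


if __name__ == "__main__":
    for name in SPLITS:
        print(name, "->", celeba_path(name))
```

Calling `load_celeba("val")` would then be equivalent to the validation snippet above, while a misspelled split name fails early with a `ValueError` instead of a failed remote lookup.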

CelebA Dataset Structure

CelebA Data Fields

image: tensor containing the 178×218 image.
bbox: tensor containing the bounding box for the respective image.
keypoints: tensor identifying 63 key points on the face.
clock_shadow: tensor to check for a 5 o'clock shadow.
arched_eyebrows: tensor to check for arched eyebrows.
attractive: tensor to check if attractive or not.
bags_under_eyes: tensor to check if bags are under the eyes.
bald: tensor to check if bald or not.
bangs: tensor to check if bangs are there or not.
big_lips: tensor to check if big lips are there or not.
big_nose: tensor to check if big nose is there or not.
black_hair: tensor to check the presence of black hair.
blond_hair: tensor to check if blond hair or not.
blurry: tensor to check if the image is blurred.
brown_hair: tensor to check the presence of brown hair.
bushy_eyebrows: tensor to check the presence of bushy eyebrows.
chubby: tensor to check if chubby or not.
double_chin: tensor to check the presence of double chin.
eyeglasses: tensor to check the presence of eyeglasses.
goatee: tensor to check the presence of a goatee in a person.
gray_hair: tensor to check the presence of gray hair.
heavy_makeup: tensor to check the presence of heavy makeup.
high_cheekbones: tensor to check the presence of high cheekbones.
male: tensor to check if the person is male.
mouth_slightly_open: tensor to check if the mouth is slightly open.
mustache: tensor to check the presence of a mustache.
narrow_eyes: tensor to check if the eyes are narrow.
no_beard: tensor to check if the person has no beard.
oval_face: tensor to check if the face is oval.
pale_skin: tensor to check if the skin is pale.
pointy_nose: tensor to check if the nose is pointy.
receding_hairline: tensor to check if the hairline is receding.
rosy_cheeks: tensor to check if the cheeks are rosy.
sideburns: tensor to check the presence of sideburns.
smiling: tensor to check if the person is smiling.
straight_hair: tensor to check if the hair is straight.
wavy_hair: tensor to check if the hair is wavy.
wearing_earrings: tensor to check the presence of earrings.
wearing_hat: tensor to check the presence of a hat.
wearing_lipstick: tensor to check the presence of lipstick.
wearing_necklace: tensor to check the presence of a necklace.
wearing_necktie: tensor to check the presence of a necktie.
young: tensor to check if the person is young.
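For multi-label training, the 40 attribute fields are often combined into a single label vector per image. The sketch below shows one way to do that in plain Python. The `to_multi_hot` helper and the example annotations are hypothetical, and because the value encoding of the attribute tensors is not stated here, the sketch assumes that any value greater than zero means the attribute is present.

```python
# Attribute tensor names in the order listed above (40 in total).
ATTRIBUTES = [
    "clock_shadow", "arched_eyebrows", "attractive", "bags_under_eyes",
    "bald", "bangs", "big_lips", "big_nose", "black_hair", "blond_hair",
    "blurry", "brown_hair", "bushy_eyebrows", "chubby", "double_chin",
    "eyeglasses", "goatee", "gray_hair", "heavy_makeup", "high_cheekbones",
    "male", "mouth_slightly_open", "mustache", "narrow_eyes", "no_beard",
    "oval_face", "pale_skin", "pointy_nose", "receding_hairline",
    "rosy_cheeks", "sideburns", "smiling", "straight_hair", "wavy_hair",
    "wearing_earrings", "wearing_hat", "wearing_lipstick",
    "wearing_necklace", "wearing_necktie", "young",
]


def to_multi_hot(sample):
    """Map {attribute_name: value} to a list of 40 zeros and ones.

    Assumption: a value greater than zero means "present", so a 0/1
    encoding and a -1/1 encoding both produce the same vector.
    """
    return [1 if int(sample[name]) > 0 else 0 for name in ATTRIBUTES]


# Hypothetical annotations for one image, for illustration only.
example = {name: 0 for name in ATTRIBUTES}
example.update(smiling=1, brown_hair=1, eyeglasses=1)

labels = to_multi_hot(example)
active = [name for name, flag in zip(ATTRIBUTES, labels) if flag]
print(active)  # ['brown_hair', 'eyeglasses', 'smiling']
```

Fixing the attribute order in one list keeps the position of each label stable across the training, validation, and test splits.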

CelebA Data Splits

The CelebA training set is composed of 162,770 images.
The CelebA test set is composed of 19,962 images.
The CelebA validation set is composed of 19,867 images.
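As a quick check on these figures, the short calculation below sums the three split sizes and reports each split's share of the whole dataset. It uses only the numbers listed above.

```python
# Split sizes as listed above.
SPLIT_SIZES = {"train": 162_770, "test": 19_962, "val": 19_867}

total = sum(SPLIT_SIZES.values())
fractions = {name: size / total for name, size in SPLIT_SIZES.items()}

print(f"total images: {total}")  # total images: 202599
for name, fraction in fractions.items():
    # Share of the full dataset, formatted as a percentage.
    print(f"{name}: {fraction:.1%}")
```

The total of 202,599 images agrees with the "more than 200K" figure in the dataset description, and the split is roughly 80% training data with about 10% each for validation and testing.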

How to use CelebA Dataset with PyTorch and TensorFlow in Python

Train a model on CelebA dataset with PyTorch in Python

Let’s use Deep Lake’s built-in one-line PyTorch dataloader to connect the data to the compute:

				
dataloader = ds.pytorch(num_workers=0, batch_size=4, shuffle=False)
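To make the `batch_size` argument concrete, the arithmetic below estimates how many batches one pass over each split yields with `batch_size=4`. This is only a sketch: `batches_per_epoch` is a hypothetical helper, and by default it assumes the final partial batch is kept rather than dropped, which is not specified above.

```python
import math

# Split sizes from the "CelebA Data Splits" section.
SPLIT_SIZES = {"train": 162_770, "val": 19_867, "test": 19_962}


def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches in one pass over num_samples items (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    if drop_last:
        # Discard the final batch if it is smaller than batch_size.
        return num_samples // batch_size
    # Keep the final, possibly smaller, batch.
    return math.ceil(num_samples / batch_size)


for name, size in SPLIT_SIZES.items():
    print(name, batches_per_epoch(size, batch_size=4))
```

With `batch_size=4` the training split does not divide evenly, so whether the last two images form their own batch depends on how the loader treats the remainder.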

Train a model on CelebA dataset with TensorFlow in Python

				
dataloader = ds.tensorflow()

Additional Information about CelebA Dataset

CelebA Dataset Description

Homepage: https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
Repository: N/A
Paper: Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou: Deep Learning Face Attributes in the Wild, Proceedings of International Conference on Computer Vision (ICCV), 2015
Point of Contact: ziwei.liu at ntu.edu.sg

CelebA Dataset Curators

Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou

CelebA Dataset Licensing Information

Deep Lake users may have access to a variety of publicly available datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have a license to use the datasets. It is your responsibility to determine whether you have permission to use the datasets under their license.

If you’re a dataset owner and do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thank you for your contribution to the ML community!

CelebA Dataset Citation Information

				
@inproceedings{liu2015faceattributes,
  title = {Deep Learning Face Attributes in the Wild},
  author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
  booktitle = {Proceedings of International Conference on Computer Vision (ICCV)},
  month = {December},
  year = {2015}
}

CelebA Dataset FAQs

What is the CelebA dataset for Python?

What is the CelebA dataset used for?

This dataset is well suited for training and testing models for face detection and facial attribute recognition, such as finding people who have brown hair, are smiling, or are wearing glasses. The images cover large pose variations, background clutter, and diverse people, and the dataset combines a large number of images with rich annotations.

How can I use CelebA dataset in PyTorch or TensorFlow?

You can stream the CelebA dataset while training a model in PyTorch or TensorFlow with one line of code using the open-source package Activeloop Deep Lake in Python. See detailed instructions on how to train a model on the CelebA dataset with PyTorch in Python or train a model on the CelebA dataset with TensorFlow in Python.
