Office-Home Dataset

Visualization of the office-home dataset on the Deep Lake UI

What is Office-Home Dataset?

The Office-Home dataset was created to evaluate deep learning algorithms for object recognition under domain adaptation. It consists of images from 4 different domains: art, clip art, product, and real-world images. The images cover 65 categories of objects commonly found in office and home settings.

Download Office-Home Dataset in Python

Instead of downloading the Office-Home dataset manually, you can load it in Python with just one line of code using the open-source Deep Lake package.

Load Office-Home Dataset in Python

import deeplake
ds = deeplake.load('hub://activeloop/office-home-domain-adaptation')

Office-Home Dataset Structure

Office-Home Data Fields
  • images: tensor containing images
  • domain_objects: labels that represent 65 categories of objects in each domain
  • domain_categories: labels that represent 4 domain categories
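
As a quick sanity check on the label fields above, one could count how many samples fall into each domain. The sketch below is a minimal illustration: the helper name `count_by_label` is hypothetical, and the commented Deep Lake calls assume the 3.x API and that `domain_categories` stores integer labels. Stand-in data is used so the snippet runs offline.

```python
from collections import Counter

def count_by_label(labels):
    """Count how many samples carry each integer label."""
    return Counter(int(label) for label in labels)

# With Deep Lake 3.x, the labels could be read roughly like this
# (an assumption, not run here):
#   import deeplake
#   ds = deeplake.load('hub://activeloop/office-home-domain-adaptation')
#   domain_labels = ds.domain_categories.numpy().flatten()

# Stand-in labels so the sketch runs offline: 4 domains encoded as 0..3
domain_labels = [0, 0, 1, 2, 2, 2, 3]
per_domain = count_by_label(domain_labels)
print(per_domain)
```

The same helper would work on `domain_objects` to check how the 65 object categories are distributed.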

How to use Office-Home Dataset with PyTorch and TensorFlow in Python

Train a model on Office-Home Dataset with PyTorch in Python

Let's use Deep Lake's built-in one-line PyTorch dataloader to connect the data to the compute:

dataloader = ds.pytorch(num_workers=0, batch_size=4, shuffle=False)
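
Once the dataloader exists, a training epoch could look roughly like the sketch below. It is an illustration, not the library's prescribed loop. It assumes each batch is a dict-like object keyed by tensor name (`images`, `domain_objects`), and that resizing, channel ordering, float conversion, and label reshaping are handled by a transform supplied when the dataloader is built. The function name `train_one_epoch` and its parameters are hypothetical.

```python
def train_one_epoch(dataloader, model, loss_fn, optimizer,
                    image_key="images", label_key="domain_objects"):
    """Run one pass over a dataloader that yields dict-like batches.

    Assumes batches map tensor names to already-preprocessed tensors;
    adjust the keys if your dataloader yields a different structure.
    Returns the mean loss over all batches (0.0 if there were none).
    """
    total_loss, num_batches = 0.0, 0
    for batch in dataloader:
        images = batch[image_key]
        labels = batch[label_key]
        optimizer.zero_grad()            # clear gradients from the previous step
        outputs = model(images)          # forward pass
        loss = loss_fn(outputs, labels)  # compare predictions with labels
        loss.backward()                  # backpropagate
        optimizer.step()                 # update model parameters
        total_loss += float(loss)
        num_batches += 1
    return total_loss / max(num_batches, 1)
```

With PyTorch, `model` would typically be an `nn.Module`, `loss_fn` something like `nn.CrossEntropyLoss()`, and `optimizer` an instance from `torch.optim`. The function itself only relies on the methods it calls.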

Train a model on Office-Home Dataset with TensorFlow in Python

The TensorFlow integration is a similar one-liner:

dataloader = ds.tensorflow()

Office-Home Dataset Creation

Data Collection and Normalization Information
A Python crawler was used to collect the images. The initial crawl produced about 100,000 images of 120 different objects. The dataset was then cleaned to make sure each image actually contains the intended object, and categories were filtered so that each one has a minimum number of images. The final version of the dataset has about 15,500 images of 65 different objects.

Additional Information about Office-Home Dataset

Office-Home Dataset Description

Office-Home Dataset Curators
Hemanth Venkateswara, Jose Eusebio, Shayok Chakraborty and Sethuraman Panchanathan

Office-Home Dataset Licensing Information
More information about the license can be found here. Deep Lake users may have access to a variety of publicly available datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have a license to use the datasets. It is your responsibility to determine whether you have permission to use the datasets under their license. If you’re a dataset owner and do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thank you for your contribution to the ML community!

Office-Home Dataset Citation Information
title={Deep hashing network for unsupervised domain adaptation},
author={Venkateswara, Hemanth and Eusebio, Jose and Chakraborty, Shayok and Panchanathan, Sethuraman},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
year={2017}
}

Office-Home Dataset FAQs

What is the Office-Home dataset for Python?

The Office-Home dataset was developed to assess domain adaptation algorithms for object recognition using deep learning. The dataset is made up of images from four different domains: art, clip art, product, and real-world images. The images were collected with a Python web crawler that searched several search engines and online image directories.

What is the Office-Home dataset used for?

The Office-Home dataset is used as a benchmark dataset for domain adaptation. It contains four domains where each domain consists of 65 categories. The four domains include art (a collection of artistic images in the form of sketches), clipart (a collection of clipart images), product (a domain containing images of objects without a background), and real-world images (a domain containing images of objects captured with a regular camera).
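
With four domains, a benchmark run usually picks one domain as the source and another as the target, which gives 4 × 3 = 12 ordered source-to-target pairs. The sketch below enumerates those pairs and splits sample indices by domain. The domain name strings and both helper names are hypothetical placeholders for illustration. The dataset itself stores domains as labels in `domain_categories`.

```python
from itertools import permutations

# Placeholder names for the four domains described above
DOMAINS = ["art", "clipart", "product", "real_world"]

def transfer_tasks(domains):
    """Return every ordered (source, target) pair of distinct domains."""
    return list(permutations(domains, 2))

def split_indices(domain_labels, source, target):
    """Return sample indices belonging to the source and target domains."""
    source_idx = [i for i, d in enumerate(domain_labels) if d == source]
    target_idx = [i for i, d in enumerate(domain_labels) if d == target]
    return source_idx, target_idx

tasks = transfer_tasks(DOMAINS)

# Stand-in per-sample domain labels so the sketch runs offline
labels = ["art", "product", "art", "clipart", "product"]
src, tgt = split_indices(labels, "art", "product")
print(len(tasks), src, tgt)
```

The resulting index lists could then be used to select the labeled source samples and the target samples on which adaptation is evaluated.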

How to download the Office-Home dataset in Python?

With the open-source Activeloop Deep Lake package, you can load the Office-Home dataset in Python with one line of code. See the "Load Office-Home Dataset in Python" section above for instructions.

How can I use Office-Home dataset in PyTorch or TensorFlow?

With the open-source Activeloop Deep Lake package, you can stream the Office-Home dataset while training a model in PyTorch or TensorFlow with one line of code. See the sections above on training a model on the Office-Home dataset with PyTorch and with TensorFlow in Python.