Researchers at the Georgia Institute of Technology applied the technique to more than 40,000 pictures, taken every 30 to 60 seconds by a wearable camera over a six-month period, and predicted with 83 per cent accuracy what activity the wearer was doing.
Researchers taught the computer to categorise images across 19 activity classes.
The test subject wearing the camera could review and annotate the photos at the end of each day to ensure that they were correctly categorised.
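The study does not publish its code, but the kind of pipeline it describes, classifying individual wearable-camera photos into 19 activity classes with a deep network trained on the wearer's own annotations, can be sketched roughly as follows. Everything in this sketch (the PyTorch framework, the ResNet backbone, the folder layout and the hyperparameters) is an illustrative assumption, not the researchers' actual implementation.

```python
# Illustrative sketch only: model choice, data layout and hyperparameters
# are assumptions, not the Georgia Tech team's published method.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

NUM_CLASSES = 19  # activity classes, as described in the study

# Standard preprocessing for an ImageNet-pretrained CNN (assumed).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Assumes the wearer's annotated photos are sorted into one folder per
# activity class, e.g. photos/train/eating/..., photos/train/driving/...
train_data = datasets.ImageFolder("photos/train", transform=preprocess)
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)

# Start from a pretrained network and replace the final layer so it
# predicts one of the 19 activity classes.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Fine-tune for a few epochs on the annotated first-person photos.
for epoch in range(5):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```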
"This work is about developing a better way to understand people's activities, and build systems that can recognise people's activities at a finely-grained level of detail," added co-author Edison Thomaz, graduate research assistant in the School of Interactive Computing.
"Activity tracking devices like the Fitbit can tell how many steps you take per day, but imagine being able to track all of your activities - not just physical activities like walking and running," said Thomaz.
The group believes it has gathered the largest annotated dataset of first-person images used to demonstrate that deep learning can understand human behaviour and the habits of a specific person.
The ability to literally see and recognise human activities has implications in a number of areas - from developing improved personal assistant applications like Siri to helping researchers explain links between health and behaviour, Thomaz said.
The researchers believe that within the next decade we will have ubiquitous devices that can improve our personal choices throughout the day.
"Once it builds your own schedule by knowing what you are doing, it might tell you there is a traffic delay and you should leave sooner or take a different route," he said.
