Visual networking refers to an emerging class of user applications that combine digital video and social networking capabilities. It is based upon the premise that visual literacy, "the ability to interpret, negotiate and make meaning from information presented in the form of a moving image", is a powerful force in how humans communicate, entertain and learn. Visual networking is characterized by a set of dualities: it subsumes entertainment and communications, professional and personal content, video and other digital media, and data networks and social networks to create immersive experiences when, where and how the user wants them. These applications have changed video content from long-form movies and broadcast television programming into a database of segments, or "clips", and social network annotations. The generation and distribution of content also takes on a new dimension with Web 2.0 applications: participatory social networks or communities that facilitate interactive creativity, collaboration and sharing between users.


The rise of visual networking is a relatively recent phenomenon driven by the emergence of social networking capabilities and the ability to deliver interactive video over a broadband network. It is a natural evolution of the current social networking phenomenon, whereby social networking annotations are layered over broadband video to create highly interactive and immersive experiences between individuals and their content. Until early 2005 this was not considered viable, owing to the lack of web and broadband infrastructure designed to support the transmission of web video and the still nascent stage of social networks like MySpace and Facebook. The introduction of YouTube in February 2005 marked the first significant combination of broadband video and social network systems designed to allow users to share, rate and tag user-generated and premium content. From 2006 to 2008 this trend continued to gain momentum as individuals and businesses pursued new combinations of video and social networking across a wide range of entertainment, communication and learning applications.

Video has largely been defined by its use as an entertainment medium. Since television became commercially available in the late 1930s, video has become the dominant entertainment medium, far eclipsing audio and text-based entertainment in both time and dollars spent. Within the past decade, video use has rapidly spread across a broader range of devices, locations and user applications. The popularization of the long tail and of user-generated video has further challenged people's ideas of what is possible with video. A key advantage of video relative to other media is its superior ability to communicate ideas and emotions economically. If a picture is worth a thousand words, then a video may be worth a thousand pictures. Video is by its very nature highly experiential, making communications more compelling, informative and memorable.

At the core of visual networking is the concept that people can participate in communities of content and communities of interest. A community of interest is defined as a community of people who share a common interest or passion. These people exchange ideas and thoughts about the given passion, but may know (or care) little about each other outside of this area. Participation in a community of interest can be compelling and entertaining, and can create a ‘sticky’ community to which people return frequently and in which they remain for extended periods. The unparalleled potential of the Internet to promote such connections is only now being fully recognized and exploited through Web-based groups established for that purpose. Based on the six degrees of separation concept (the idea that any two people on the planet could make contact through a chain of no more than five intermediaries), social networking establishes interconnected Internet communities (sometimes known as personal networks) that help people make contacts who would be good for them to know, but whom they would be unlikely to have met otherwise.
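
The six degrees of separation idea can be illustrated with a breadth-first search over a friendship graph. The following is only a minimal sketch under stated assumptions: the graph, the names in it, and the function name `intermediaries` are hypothetical and are not taken from any real social network or library.

```python
from collections import deque


def intermediaries(graph, start, target):
    """Return how many people sit between start and target on the
    shortest chain of acquaintances, or None if no chain exists."""
    if start == target:
        return 0
    seen = {start}
    # Each queue entry is (person, number of links walked from start).
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        for friend in graph.get(person, ()):
            if friend == target:
                # 'hops' links were walked to reach 'person', so exactly
                # 'hops' people lie strictly between start and target.
                return hops
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None


# Hypothetical toy network: ana - ben - cho - dee, with eli unconnected.
friends = {
    "ana": ["ben"],
    "ben": ["ana", "cho"],
    "cho": ["ben", "dee"],
    "dee": ["cho"],
    "eli": [],
}
```

In this toy network, `intermediaries(friends, "ana", "dee")` walks the chain through ben and cho, while a query involving eli finds no chain at all. Under the six degrees concept, the value returned for any two connected people would be expected to stay at five or below.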

The phrase The Long Tail was, according to Chris Anderson, first coined by him in October 2004. Anderson argued that products that are in low demand or have low sales volume can collectively make up a market share that rivals or exceeds that of the relatively few current bestsellers and blockbusters, if the store or distribution channel is large enough. The Long Tail also has implications for the producers of content, especially those whose products could not, for economic reasons, find a place in pre-Internet information distribution channels controlled by book publishers, record companies, movie studios, and television networks. Looked at from the producers' side, the Long Tail has made possible a flowering of creativity across all fields of human endeavor. One example of this is YouTube, where thousands of diverse videos, whose content, production value or lack of popularity make them inappropriate for traditional television, are easily accessible to a wide range of viewers. The benefit to consumers is that they now have an almost infinite choice of content to select from and are able to create their own specific channels based upon their unique needs. A potential negative side effect of the long tail is the rapidly growing inventory of text, audio and video content. The storage and distribution systems of the past restricted the number of songs, videos and books, making it easier to search for what was relevant to the individual. As the long tail has grown, more and more content, relevant and irrelevant alike, passes an individual by without their knowledge. This is especially true for video because, unlike text-based files, which can be searched and indexed for easy finding, video typically has only its title as a clue to what is in it. This lack of comprehensive metadata has limited the applicability of traditional search models.

Traditional search has been augmented by the emergence of content-based discovery tools that make people aware of relevant content based upon their participation in communities of interest and/or communities of content. The idea is that users may or may not start out searching for something, but they soon begin reacting to things they find, exploring links on pages they stumble upon and taking cues from fellow surfers about where to go. Instead of the old, passive, lean-back style of watching video, viewers actively seek content through discovery. People interact with each other, posting comments on what they have just seen. Many sites now allow people to vote on videos, ranking and rating them. A ranking is produced by one of a number of algorithms that measure, for example, how many people have watched something or how many sites link to it.
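
The two ranking signals mentioned above, how many people have watched a clip and how many sites link to it, can be sketched as a simple sort. This is an illustrative sketch only: the `Clip` record, its field names, the sample figures and the `rank_clips` function are all hypothetical, and real sites are likely to use more elaborate algorithms.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    # Hypothetical record for one video clip.
    title: str
    views: int          # how many times the clip has been watched
    inbound_links: int  # how many sites link to the clip


def rank_clips(clips, signal="views"):
    """Return clips ordered from highest to lowest on the chosen signal."""
    if signal not in ("views", "inbound_links"):
        raise ValueError(f"unknown ranking signal: {signal}")
    return sorted(clips, key=lambda clip: getattr(clip, signal), reverse=True)


# Made-up sample data, for illustration only.
clips = [
    Clip("cooking demo", views=1200, inbound_links=3),
    Clip("concert footage", views=900, inbound_links=40),
    Clip("pet video", views=5000, inbound_links=12),
]

by_views = [clip.title for clip in rank_clips(clips, "views")]
by_links = [clip.title for clip in rank_clips(clips, "inbound_links")]
```

With the sample data, the two signals produce different orderings: the pet video leads on views, while the concert footage leads on inbound links. This is one reason a site might offer more than one ranking.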

YouTube is the best early example of a visual networking experience. YouTube is a video-sharing website where users can upload, view and share video clips. Unregistered users can watch most videos on the site, while registered users are permitted to upload an unlimited number of videos. Few statistics are publicly available regarding the number of videos on YouTube. However, in July 2006, the company revealed that more than 100 million videos were being watched every day, and that 2.5 billion videos had been watched in June 2006. Some 50,000 videos were being added per day in May 2006, and this had increased to 65,000 by July. In January 2008 alone, nearly 79 million users watched over 3 billion videos on YouTube.

Telepresence refers to a set of technologies that allow a person to feel as if they were present, to give the appearance that they were present, or to have an effect, at a location other than their true location. Telepresence requires that the senses of the user, or users, be provided with such stimuli as to give the feeling of being in that other location. Additionally, the user(s) may be given the ability to affect the remote location. In this case, the user's position, movements, actions, voice, etc. may be sensed, transmitted and duplicated in the remote location to bring about this effect. Information may therefore travel in both directions between the user and the remote location. Critical to creating an in-person experience is the presence of high-definition video perfectly synchronized with stereophonic sound. A minimum system usually includes visual feedback. Ideally, the entire field of view of the user is filled with a view of the remote location, and the viewpoint corresponds to the movement and orientation of the user's head. In this way, it differs from television or cinema, where the viewpoint is out of the control of the viewer.

Although visual networking is still in its infancy, applications are beginning to emerge that span both consumer and business markets.

The proliferation of multi-function mobile devices, particularly those with built-in digital cameras and/or video cameras, is making it easier for individuals to share first-person photos and videos in real time with their friends.

Interactive television represents a continuum from low interactivity (TV on/off, volume, changing channels) through moderate interactivity (simple movies on demand without player controls) to high interactivity, in which, for example, an audience member affects the program being watched. The most obvious example of high interactivity is any kind of real-time voting on the screen, in which audience votes determine how the show continues.

