Crosstalk in Stereoscopic Displays – Why 3D Movies Look Weird
Remember 3D TVs? Remember how the TV manufacturers tried to sell the idea of wearing goggles at home while watching TV, so you could experience that Avatar-style novelty from the comfort of your own couch? Although the idea of 3D television is pretty much dead now, and everyone is more focused on the possibilities of AR and VR, I would not call 3D a dead technology yet. After all, cinemas are still pushing 3D movies, and if we eventually crack the method for building glasses-free 3D televisions, we could maybe even see the second coming of 3D at home. 3D has just been forgotten, a fate it has endured once before as well.
This article is not so much about the future of 3D, however, as about an interesting phenomenon in human perception caused by 3D displays, called crosstalk. Many of us have most likely encountered crosstalk at least once in our lives, but due to the
Stereoscopic three-dimensional (3D) screens are generally better at immersing the viewer in the displayed content, thus enabling richer experiences. The difference in immersion compared to more traditional two-dimensional (2D) screens comes from binocular depth. However, because 3D displays do not actually display content in three-dimensional space, but instead rely on projecting a different image to each eye, they can induce various visual artifacts that cause discomfort and degrade the Quality of Experience. One of these visual artifacts is crosstalk, an effect caused by one eye’s half of the stereoscopic image bleeding into the other eye. This paper will focus on explaining the causes of crosstalk and how the human visual system perceives it, concentrating on the literature regarding crosstalk. A couple of different technologies used to create three-dimensionality will be inspected in relation to crosstalk, along with the different definitions of crosstalk. This article will also look at studies on measuring crosstalk, how different people experience crosstalk, and the different levels of crosstalk. I will conclude this essay by introducing ways of combating crosstalk in 3D displays.
But what is crosstalk then, and what causes it? To understand the crosstalk phenomenon, we first have to understand how 3D videos are created and how they can be viewed on different display technologies. The basic approach to creating 3D video is to film a scene with two cameras, either two real cameras or two virtual cameras, the latter being a standard method in digital animation. This produces two recordings of the same scene from slightly different angles. This is much the same as how human eyes work: we actually see two different views of the same world, which are eventually combined into a single image through stereopsis. To create the illusion of two different views, and with it the illusion of depth, the display has to show both views on top of each other while making sure that each eye sees only its own view. The key technologies of 3D displays have to do with solving this problem.
Understanding the technologies used to achieve the 3D effect is vital, as crosstalk differs between technologies. These differences are inspected more closely later, although only to a limited degree. One of the most well-known 3D technologies is anaglyphic 3D, which uses color filters to show a different view to each eye. The filtering is usually achieved by wearing a pair of glasses with red/green filters. Another similar but more widely used method uses polarizing filters (polarization-multiplexing) to pass the correct image to each eye. Both of these methods are cheap and work with passive glasses that need no electricity. There are also active methods, such as the eclipse method of alternately blocking the view of each eye (time-multiplexing). Yet another approach is interference filter technology, which displays the two images using different wavelengths of light. Also worth mentioning are autostereoscopic displays, which enable stereoscopic viewing without glasses.
Now that we understand the technological context, crosstalk becomes easier to explain. Crosstalk is essentially produced by imperfect view separation, which leads to a proportion of one eye’s image being seen by the other eye as well (Xing 2012). This definition is good enough for communicating the general concept of crosstalk. For scientific discussion, however, it is much too loosely defined. In the following parts, the term and definition of crosstalk are inspected more closely.
Well known and widely used in the literature on stereoscopic displays, crosstalk is also known by other names and spellings (Woods 2011), such as cross talk, cross-talk, leakage, extinction, extinction ratio, 3D contrast, and even x-talk. In addition to these, the term ‘ghosting’ is often used interchangeably, but according to Woods, the first separate definition for ghosting comes from Lipton in 1987 (Lipton 1987, Woods 2011). Lipton’s definitions of ghosting and crosstalk are refined in a publication from 2009, where crosstalk is “incomplete isolation of the left and right image channels so that one leaks or bleeds into the other” and ghosting is defined as the subjective perception of crosstalk (Lipton 2009). Woods also points out a definition made by Huang et al., in which system crosstalk describes crosstalk itself, while viewer crosstalk refers to the subjective experience of it, i.e. ghosting (Woods 2010).
To be used as a metric, crosstalk has to be defined mathematically. Woods et al. mention in their article that there exist two different mathematical definitions for ‘crosstalk ratio’ (Woods 2011). They also point out that unfortunately, there exist several papers quoting crosstalk values, without specifying which definition is used. This serves to strengthen the need for a standardized definition of crosstalk ratio.
Woods et al. define crosstalk in its simplest form as leakage divided by signal, multiplied by one hundred (Woods 2011):
Crosstalk (%) = (leakage / signal) × 100
In this formula, leakage means the maximum luminance of light that leaks from the unintended channel into the intended channel, and signal means the maximum luminance of the intended channel. Leakage is measured as the luminance seen in the intended channel when it displays black while the unintended channel displays white. In the same manner, signal is measured as the luminance seen in the intended channel when it displays white while the unintended channel displays black.
Woods et al. also point out two other mathematical definitions, which agree with this basic definition; they fall outside the scope of this paper and are not presented here, but can be found in articles by Chu et al. and Hong et al. (Chu 2005, Hong 2010). However, because the basic definition does not take the effect of black level into consideration, Woods et al. introduce another mathematical definition in which a non-zero black level is taken into account by subtracting the black level luminance from both leakage and signal:
Crosstalk (%) = ((leakage − black level) / (signal − black level)) × 100
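The two crosstalk ratio definitions described above can be sketched in a few lines of Python. This is only an illustration: the function names and the luminance readings are made up for the example and do not come from Woods et al.

```python
def crosstalk_simple(leakage, signal):
    """Basic definition: leakage divided by signal, as a percentage."""
    if signal <= 0:
        raise ValueError("signal luminance must be positive")
    return leakage / signal * 100.0


def crosstalk_black_corrected(leakage, signal, black_level):
    """Black-level-corrected definition: the black level luminance is
    subtracted from both leakage and signal before taking the ratio."""
    if signal <= black_level:
        raise ValueError("signal luminance must exceed the black level")
    return (leakage - black_level) / (signal - black_level) * 100.0


# Hypothetical luminance readings in cd/m^2 (illustrative, not measured data).
leakage = 4.5      # intended channel shows black, unintended channel shows white
signal = 100.5     # intended channel shows white, unintended channel shows black
black_level = 0.5  # both channels show black

print(round(crosstalk_simple(leakage, signal), 2))
print(round(crosstalk_black_corrected(leakage, signal, black_level), 2))
```

With these example readings the two definitions give slightly different percentages, which is why a quoted crosstalk value is hard to interpret unless the definition used is stated.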
Different 3D technologies have different levels of crosstalk, and crosstalk exists in almost all stereoscopic displays. The exceptions are displays using the Wheatstone and Helmholtz approaches, which do not conform to today’s requirement of a flat-panel display and are therefore left out of the scope of this paper (Daly 2011). Nevertheless, crosstalk is a primary concern in developing stereoscopic display systems (Daly 2011).
One technology that is especially prone to crosstalk is anaglyphic 3D, whose main benefits are simplicity and cost. In addition to crosstalk, its main disadvantage is its inability to depict full-color images (Woods 2004). Anaglyphic 3D systems are much rarer these days, as time-multiplexed and polarization-multiplexed approaches tend to work better, even with lesser quality control. The level of filtering in anaglyphic 3D systems can vary considerably between glasses from different manufacturers, resulting in varying amounts of crosstalk (Woods 2004). In their 2004 paper, Woods et al. list properties that cause crosstalk in anaglyphic 3D systems. These are:
- Display spectral response – crosstalk is caused by the display’s primary color bands overlapping each other significantly, making it difficult to separate those colors with color filters.
- Anaglyph glasses spectral response – crosstalk is caused by the color filters of the anaglyph glasses passing light from outside the desired band, for example by also passing light of longer wavelengths.
- Image compression – crosstalk is caused by image compression formats (e.g. MPEG, JPEG) mixing information between the three RGB color channels.
- Image encoding and transmission – crosstalk is caused by analog consumer video formats (NTSC, PAL) mixing the RGB color channels during encoding.
Crosstalk is also present in other technologies. For example, Lipton’s 1987 paper names two crosstalk factors for time-multiplexed stereoscopic systems: phosphor decay and the dynamic range of the shutters (Lipton 1987). Phosphor decay refers to the phosphor afterglow seen through liquid crystal shutter glasses. It is common with CRT-based displays, where the still-decaying phosphor image of one eye’s view leaks into the other eye’s view. The other crosstalk factor, according to Lipton, is the dynamic range of the glasses, meaning the ratio of the transmission of the glasses’ shutters in their open state to that in their closed state. A limited shutter dynamic range leads to part of the wrong image being shown to each eye (Seuntiëns 2005).
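To make the dynamic range factor more concrete, here is a minimal sketch of how it relates to leakage. It rests on simplifying assumptions that are mine, not Lipton’s: the shutters are the only source of crosstalk, and the display emits the same luminance for both views. Under those assumptions, the leakage-to-signal ratio is simply the inverse of the dynamic range. The function names and transmission values are hypothetical.

```python
def shutter_dynamic_range(open_transmission, closed_transmission):
    """Ratio of shutter transmission in the open state to the closed state."""
    if closed_transmission <= 0:
        raise ValueError("closed-state transmission must be positive")
    if open_transmission <= closed_transmission:
        raise ValueError("open-state transmission must exceed closed-state")
    return open_transmission / closed_transmission


def shutter_crosstalk_percent(open_transmission, closed_transmission):
    """Leakage through the closed shutter as a percentage of the open-state
    signal, assuming the shutter is the only crosstalk source."""
    return 100.0 / shutter_dynamic_range(open_transmission, closed_transmission)


# Hypothetical shutter: 30 % transmission when open, 0.03 % when closed.
print(round(shutter_dynamic_range(0.30, 0.0003)))
print(round(shutter_crosstalk_percent(0.30, 0.0003), 3))
```

In a real system phosphor decay and other factors add to this figure, so the value should be read as a lower bound on crosstalk rather than a prediction.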
In addition to the technologies presented here, there are also other 3D technologies in which crosstalk exists. As with anaglyphic and polarized 3D technologies, each technology has its own factors that affect the amount of crosstalk. In conclusion, when designing 3D-enabled experiences, a decent amount of research should be dedicated to understanding the factors affecting the amount of crosstalk in the chosen approach. This is further supported by a statement found in the 3D literature that crosstalk is intrinsic to most of the technologies (Seuntiëns 2005).
The presence of crosstalk can lead to several problems such as general annoyance, visual discomfort, hindrance of fusing the images together, and breakdown of stereoscopic depth (Daly 2011, Woods 2004). Crosstalk is even considered to be one of the most annoying distortions in the visualization stage of stereoscopic imaging systems (Xing 2012, Seuntiëns 2005). At best, crosstalk is perceivable as a faint halo surrounding the edges of objects (Daly 2011) and at worst it contains all of the previously mentioned problems.
The perceived amount of crosstalk depends on disparity and amplitude, with high levels resulting in ghosting (Daly 2011). According to Daly et al., with small disparities and low amplitudes, such as in textures, crosstalk is perceivable only as a blur. With moderate amplitudes and disparities, crosstalk appears as tolerable double edges, while higher amplitudes can make it appear as an annoying ghost image. At even higher levels, the double image disturbs stereoscopic fusion and prevents the depth effect. Daly et al. also point out other effects on viewer experience, such as general annoyance and discomfort, which can exist along with ghosting.
Another description of the effect on viewer experience can be found in a paper by Xing et al. (Xing 2011). They claim that comparatively few research efforts have been dedicated to this subject, and their paper does indeed describe an extensive study on the subjective experience of crosstalk. The study investigated how strongly crosstalk is perceived once it is visible. Additionally, it focused on more realistic scenarios, examining how natural scenes with varying crosstalk levels affect the perceptual attributes of crosstalk. The study identified three perceptual attributes of crosstalk, which can be categorized into 2D perceptual attributes that exist in a single eye’s view, and 3D perceptual attributes that exist once the views have been fused. Two of the attributes, shadow degree of crosstalk and separation distance of crosstalk, are 2D perceptual attributes, while the third, spatial position, is a 3D attribute.
Xing et al. define the shadow degree of crosstalk as the distinctness of crosstalk compared to the original view. As the shadow degree rises, the crosstalk is experienced as more annoying. Xing et al. point out that the contrast of the scene content relates to the shadow degree of crosstalk. The other 2D perceptual attribute they define is the separation distance of crosstalk, meaning the distance of the crosstalk from the original view. According to the paper, crosstalk is experienced as more annoying with increased separation distance. In addition to the 2D perceptual attributes, which also interact with each other, Xing et al. list one 3D perceptual attribute, spatial position. Spatial position is defined as the impact of the crosstalk’s position in 3D space on perception when the left and right views are fused and 3D perception is generated (Xing 2011). It is also closely related to the two 2D perceptual attributes, as it only affects visible crosstalk that satisfies the requirements of shadow degree and separation distance.
Crosstalk also affects the perceived depth in thin structures and in natural scenes (Tsirlin 2010, Tsirlin 2012). In their 2010 paper, Tsirlin et al. demonstrate that as the level of crosstalk increases, the magnitude of perceived depth decreases. They also define a maximum crosstalk level of 4%, up to which perceived depth and visual comfort remain at suitable levels. These findings are supported by their 2012 paper, in which they assign the 4% level to synthetic images and a lower 2% level to natural images, as crosstalk has a more disruptive effect on natural scenes.
To conclude the subject of crosstalk’s effect on viewer experience, it can be argued that the effects are multiple and that many of them are connected, as demonstrated by Xing et al. (Xing 2011). As for the level of crosstalk in a 3D system, 4 percent seems to be the acceptable upper limit according to Tsirlin et al., but there are also differing views, as Woods et al. note (Woods 2010).
Studying and Measuring Crosstalk
In order to measure and conduct studies on crosstalk, a crosstalk-free technology has to be used. The Wheatstone and Helmholtz approaches mentioned earlier are capable of reaching a zero percent level of crosstalk and are therefore often used in crosstalk studies. In this case the crosstalk effect is often created digitally, which allows the experimenter to adjust the amount of crosstalk quite easily.
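Creating crosstalk digitally can be sketched as mixing a chosen fraction of the unintended view into the intended one. The sketch below assumes a simple additive model on linear luminance values between 0 and 1. This model and the function name `add_crosstalk` are my own illustration, and the studies cited here may well use a different procedure.

```python
def add_crosstalk(intended, unintended, level):
    """Return the intended view with a fraction of the unintended view added.

    intended, unintended: equally long lists of linear luminance values in [0, 1].
    level: crosstalk as a fraction, e.g. 0.04 for 4 %.
    """
    if len(intended) != len(unintended):
        raise ValueError("both views must have the same number of pixels")
    if not 0.0 <= level <= 1.0:
        raise ValueError("crosstalk level must be between 0 and 1")
    # Clamp at 1.0 because the display cannot exceed full white.
    return [min(1.0, i + level * u) for i, u in zip(intended, unintended)]


# One image row with a bright bar that sits one pixel apart in the two views,
# standing in for a small disparity (hypothetical data).
left_view = [0.0, 0.0, 1.0, 0.0]
right_view = [0.0, 1.0, 0.0, 0.0]

print(add_crosstalk(left_view, right_view, 0.04))
```

The faint copy of the right-eye bar that appears next to the left-eye bar is the ghost image, and changing `level` is how an experimenter would step through different crosstalk percentages.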
Crosstalk can be measured with optical sensors and with the use of visual measurement charts (Woods 2010). In the case of optical sensors, consideration has to be given to how well the sensor’s spectral sensitivity matches that of an eye. Traditionally, the measurements are made by measuring the leakage between opposing channels when full white is shown in one channel and full black in the other. This metric can be called black-and-white crosstalk and is often used, as it works with the crosstalk formulas described earlier. A variation of this metric is called grey-to-grey crosstalk, which is aimed at crosstalk measurement in time-sequential 3D systems, as crosstalk occurs differently in them. The basic method consists of measuring multiple grey level combinations and analyzing them.
While the optical sensor method is not necessarily slow, a quicker way is to measure with visual measurement charts. However, as this method is based on subjective evaluation, there is a greater chance of error than when using sensors (Woods 2010). This method also has other limitations, for example a requirement for correct gamma calibration, potential variation in crosstalk levels across different parts of the screen, and the charts not accounting for the non-zero black level of some monitors.
The complete elimination of crosstalk is difficult, and methods that come close to succeeding in this are rarely compact (see the Wheatstone and Helmholtz approaches) (Daly 2011). The methods for counteracting crosstalk are many, ranging from improving the quality and build of 3D viewing hardware to software-based compensation algorithms. This is a consequence of crosstalk being the sum of many factors. However, it must also be taken into consideration that reducing crosstalk with any of these methods involves cost and benefit trade-offs (Woods 2010).
Various signal-processing techniques are being developed, and they have proven particularly valuable in hardware approaches using synchronized, alternating viewpoints (Daly 2011). Methods using liquid crystal displays, for example, can be improved with overdrive algorithms. Another algorithmic method is known as L-R matrix compensation, which aims to anticipate the ghost images and remove them from the images sent to the screen. This way the unwanted crosstalk signal, which is a result of the display process, is pre-subtracted from the digital image. This method does not, however, work well when crosstalk is caused by high enough contrast. In that case the subtraction goes negative, which is impossible for the physical display to achieve, resulting in incomplete crosstalk correction.
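The pre-subtraction idea, and the way it breaks down at high contrast, can be illustrated with a small sketch. It assumes a symmetric linear model in which each eye sees its own image plus a fraction c of the other eye’s image. The model, the function names, and the pixel values are assumptions made for this example, not the exact algorithm from the literature.

```python
def clip(value, low=0.0, high=1.0):
    """Limit a drive value to what the display can physically show."""
    return max(low, min(high, value))


def compensate(left, right, c):
    """Pre-subtract the expected leakage from both views.

    Assumes each eye sees: own_output + c * other_output, and inverts that
    2x2 mixing. Results are clipped to [0, 1].
    """
    if not 0.0 <= c < 1.0:
        raise ValueError("crosstalk fraction must be in [0, 1)")
    scale = 1.0 - c * c
    out_left = [clip((l - c * r) / scale) for l, r in zip(left, right)]
    out_right = [clip((r - c * l) / scale) for l, r in zip(left, right)]
    return out_left, out_right


def perceived(intended_out, unintended_out, c):
    """What one eye sees under the assumed mixing model."""
    return [i + c * u for i, u in zip(intended_out, unintended_out)]


c = 0.05  # hypothetical 5 % crosstalk
left = [0.5, 0.0]   # pixel 0: mid grey in both views, pixel 1: black in left
right = [0.5, 1.0]  # pixel 1: full white in right (high contrast)

out_left, out_right = compensate(left, right, c)
print(perceived(out_left, out_right, c))
```

For the mid-grey pixel the left eye ends up seeing approximately the intended 0.5. For the high-contrast pixel the required drive value would be negative, so it is clipped to zero and the left eye still sees a ghost of roughly 0.05 instead of black.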
In addition to the algorithmic approach, one could also decrease the level of crosstalk by reducing the contrast ratio of the image, or even of the display. Reducing the brightness of the display used for showing the 3D images would also be beneficial (Woods 2010). However, this would also result in reduced image quality, and, for example, in the case of polarized glasses the end quality could be questionable. This highlights the problem of having multiple different technologies: a universal solution for counteracting crosstalk is possibly impossible.
This essay has reviewed some of the scientific literature found on the crosstalk effect. The main foci of the paper have been on defining the concept of crosstalk, its relation to different stereoscopic technologies, its relation to the viewer experience, and the ways of counteracting it. In addition to the former, the essay has also briefly highlighted the technological aspects related to crosstalk. This paper cannot be described as a thorough inspection of the crosstalk phenomenon but mainly as an introduction to it. To achieve a more thorough look at the phenomenon, the essay should be narrowed down to a single stereoscopic imaging technology. This can also be said to be one of the essay’s shortcomings, as there is a slight chance of some of the technologies related to crosstalk getting confused due to the relatively light introduction to stereoscopic technologies.
This essay’s structure has been more akin to that of a scientific paper than that of an essay. Therefore there has been relatively little reflection on the subject, and the focus has been mainly on presenting the knowledge found in the literature. However, I personally feel that a traditional reflective essay would not have worked in this case, as many of the subject’s properties are more a matter of fact than a subject of discussion. Another remark I want to make is that the amount of background material is quite small, and there was a noticeable problem in finding related literature that fit the scope of the essay. This may be due to my quite limited knowledge of the context, as the subject itself is relatively well researched.