Method Article
Human depth perception of stereoscopic 3D video depends on the camera separation, the point of convergence, the distance to the object, and the viewer's familiarity with the object. This paper presents a robotized method for rapid and reliable test data collection during live open-heart surgery to determine the ideal camera configuration.
Stereoscopic 3D video of surgical procedures can be highly valuable for medical education and can improve clinical communication. However, access to the operating room and the surgical field is restricted: it is a sterile environment, and the physical space is crowded with surgical staff and technical equipment. In this setting, unobscured capture and realistic reproduction of surgical procedures are difficult. This paper presents a method for rapid and reliable collection of stereoscopic 3D video at different camera baseline distances and convergence distances. To collect test data with minimal interference during surgery, and with high precision and repeatability, the cameras were attached to the two hands of a dual-arm robot that was ceiling-mounted in the operating room. The robot was programmed to perform a timed sequence of synchronized camera movements, stepping through a range of test positions with baseline distances from 50 mm to 240 mm in incremental steps of 10 mm, at two convergence distances of 1100 mm and 1400 mm. Surgery was paused to allow the capture of 40 consecutive 5-s video samples. In total, 10 surgical scenarios were recorded.
In surgery, 3D visualization can be used for education, diagnosis, pre-operative planning, and post-operative evaluation1,2. Realistic depth perception can improve understanding3,4,5,6 of normal and abnormal anatomies. Simple 2D video recordings of surgical procedures are a good start. However, the lack of depth perception can make it hard for non-surgical colleagues to fully understand the antero-posterior relationships between anatomical structures, and it therefore also introduces a risk of misinterpretation of the anatomy7,8,9,10.
The 3D viewing experience is affected by five factors: (1) the camera configuration, either parallel or toed-in, as shown in Figure 1; (2) the baseline distance (the separation between the cameras); (3) the distance to the object of interest and other scene characteristics, such as the background; (4) the characteristics of the viewing devices, such as screen size and viewing position1,11,12,13; and (5) the individual preferences of the viewers14,15.
Designing a 3D camera setup begins with capturing test videos at various camera baseline distances and configurations for subjective or automatic evaluation16,17,18,19,20. The camera distance to the surgical field must be kept constant to capture sharp images. Fixed focus is preferred because autofocus would otherwise adjust to hands, instruments, or heads that come into view. However, this is not easily achieved when the scene of interest is the surgical field. Operating rooms are restricted-access areas because these facilities must be kept clean and sterile. Technical equipment, surgeons, and scrub nurses often cluster closely around the patient to secure a good visual overview and an efficient workflow. To compare and evaluate the effect of camera positions on the 3D viewing experience, one complete test range of camera positions should record the same scene, because object characteristics such as shape, size, and color can affect the 3D viewing experience21.
For the same reason, complete test ranges of camera positions should be repeated on different surgical procedures. The entire sequence of positions must be repeated with high accuracy. In a surgical setting, existing methods that require either manual adjustment of the baseline distance22 or different camera pairs with fixed baseline distances23 are not feasible because of both space and time constraints. To address this challenge, this robotized solution was designed.
The data was collected with a dual-arm collaborative industrial robot mounted in the ceiling in the operating room. Cameras were attached to the wrists of the robot and moved along an arc-shaped trajectory with increasing baseline distance, as shown in Figure 2.
To demonstrate the approach, 10 test series were recorded from 4 patients with 4 different congenital heart defects. Scenes were chosen when a pause in surgery was feasible: with the beating hearts just before and after surgical repair. Series were also made while the hearts were arrested. Each surgery was paused for 3 min and 20 s to collect forty 5-s sequences with different camera convergence distances and baseline distances. The videos were later post-processed and displayed in 3D to the clinical team, who rated how realistic each 3D video appeared on a scale from 0 to 5.
The convergence point for toed-in stereo cameras is where the center points of both images meet. The convergence point can, in principle, be placed in front of, on, or behind the object (Figure 1A-C). When the convergence point is in front of the object, the object is captured and displayed left of the midline in the left camera image and right of the midline in the right camera image (Figure 1A). The opposite applies when the convergence point is behind the object (Figure 1B). When the convergence point is on the object, the object appears on the midline of both camera images (Figure 1C), which presumably should yield the most comfortable viewing, since no squinting is required to merge the images. To achieve comfortable stereo 3D video, the convergence point must be located on, or slightly behind, the object of interest; otherwise, the viewer is required to voluntarily squint outwards (exotropia).
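To make the three cases concrete, the following minimal Python sketch (illustrative only, not part of the authors' published software; the function name and distances are hypothetical) computes which side of the left camera's image midline a centered object falls on; the right camera mirrors the result.

```python
import math

def left_image_side(baseline_mm, convergence_mm, object_mm):
    """Side of the left camera's image midline where a centered object appears."""
    half = baseline_mm / 2.0
    # Rightward bearing, measured from straight ahead, of the optical axis (aimed
    # at the convergence point) and of the object, seen from the left camera at x = -half.
    axis = math.atan2(half, convergence_mm)
    target = math.atan2(half, object_mm)
    if target < axis:
        return "left of midline"    # convergence point in front of the object (Figure 1A)
    if target > axis:
        return "right of midline"   # convergence point behind the object (Figure 1B)
    return "on the midline"         # convergence point on the object (Figure 1C)

# Object at 1300, 1100, and 900 mm with the convergence point fixed at 1100 mm:
for object_mm in (1300, 1100, 900):
    print(object_mm, "->", left_image_side(150, 1100, object_mm))
```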
The data was collected using a dual-arm collaborative industrial robot to position the cameras (Figure 2A-B). The robot weighs 38 kg without equipment and is intrinsically safe: when it detects an unexpected impact, it stops moving. The robot was programmed to position the 5-megapixel cameras with C-mount lenses along an arc-shaped trajectory, stopping at predetermined baseline distances (Figure 2C). The cameras were attached to the robot hands using adaptor plates, as shown in Figure 3. Each camera recorded at 25 frames per second. The lenses were set to f/8 with the focus fixed on the object of interest (the approximated geometric center of the heart). Every image frame carried a timestamp, which was used to synchronize the two video streams.
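As an illustration of that synchronization step, the sketch below pairs frames by nearest timestamp. This is a minimal sketch under assumed conditions, not the authors' recording application: each stream is taken to be a list of (timestamp_seconds, frame) tuples, and the matching tolerance of half a frame period is an assumption.

```python
def pair_frames(left, right, fps=25.0):
    """Pair left/right frames whose timestamps differ by less than half a
    frame period; unmatched frames are dropped rather than shown unsynchronized."""
    if not right:
        return []
    tolerance = 0.5 / fps  # 20 ms at 25 frames per second
    pairs, j = [], 0
    for t_left, frame_left in left:
        # Advance the right-stream cursor to the timestamp closest to t_left.
        while j + 1 < len(right) and abs(right[j + 1][0] - t_left) < abs(right[j][0] - t_left):
            j += 1
        t_right, frame_right = right[j]
        if abs(t_right - t_left) < tolerance:
            pairs.append((frame_left, frame_right))
    return pairs
```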
The offsets between the robot wrists and the cameras were calibrated. This can be achieved by aligning the crosshairs of the camera images, as shown in Figure 4. In this setup, the total translational offset from the mounting point on the robot wrist to the center of the camera image sensor was 55.3 mm in the X-direction and 21.2 mm in the Z-direction, as displayed in Figure 5. The rotational offsets were calibrated at a convergence distance of 1100 mm and a baseline distance of 50 mm and were adjusted manually with the joystick on the robot control panel. The robot in this study had a specified accuracy of 0.02 mm in Cartesian space and a rotational resolution of 0.01 degrees24. At a radius of 1100 mm, an angle difference of 0.01 degrees offsets the center point by 0.2 mm. During the full robot motion from 50 mm to 240 mm separation, the crosshair of each camera stayed within 2 mm of the ideal center of convergence.
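As a quick check of that figure, the displacement follows from the arc-length relation (offset = radius × angle in radians); the short sketch below reproduces the 0.2 mm value.

```python
import math

radius_mm = 1100.0   # convergence distance
angle_deg = 0.01     # rotational resolution of the robot
offset_mm = radius_mm * math.radians(angle_deg)
print(f"{offset_mm:.2f} mm")  # ~0.19 mm, reported as 0.2 mm above
```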
The baseline distance was increased stepwise by symmetrical separation of the cameras around the center of the field of view in increments of 10 mm ranging from 50-240 mm (Figure 2). The cameras were kept at a standstill for 5 s in each position and moved between the positions at a velocity of 50 mm/s. The convergence point could be adjusted in X and Z directions using a graphical user interface (Figure 6). The robot followed accordingly within its working range.
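The target poses follow from the geometry alone: at every baseline distance, the cameras lie on a circle of radius equal to the convergence distance, centered on the convergence point, and toe in until their optical axes meet there. The Python sketch below (a geometric illustration, not the authors' robot program; all names are hypothetical) enumerates the full test range.

```python
import math

def camera_poses(convergence_mm, baselines_mm=range(50, 250, 10)):
    """Yield (baseline, x_left, x_right, z, toe_in_deg) for each test position,
    with the convergence point below the origin of the X axis."""
    for d in baselines_mm:
        half = d / 2.0
        z = math.sqrt(convergence_mm**2 - half**2)  # camera height above the scene
        toe_in_deg = math.degrees(math.asin(half / convergence_mm))
        yield d, -half, half, z, toe_in_deg

for d, xl, xr, z, a in camera_poses(1100):
    print(f"baseline {d:3d} mm: x = {xl:+6.1f}/{xr:+6.1f} mm, "
          f"z = {z:6.1f} mm, toe-in {a:4.2f} deg")
```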
The accuracy of the convergence point was estimated using the similar triangles and the variable names in Figure 7A and B. The height 'z' was calculated from the convergence distance 'R' and the baseline distance 'D' with the Pythagorean theorem as

z = √(R² − (D/2)²)

When the real convergence point was closer than the desired point, as shown in Figure 7A, the error distance 'f1' was calculated as

f1 = z·e / (D + e)

Similarly, when the convergence point was distal to the desired point, the error distance 'f2' was calculated as

f2 = z·e / (D − e)
Here, 'e' was the maximum separation between the crosshairs, at most 2 mm at the maximum baseline separation during calibration (D = 240 mm). For R = 1100 mm (z = 1093 mm), the error was less than ±9.2 mm. For R = 1400 mm (z = 1395 mm), the error was less than ±11.7 mm. That is, the placement error of the convergence point was within 1% of the desired convergence distance. The two test distances of 1100 mm and 1400 mm were therefore well separated.
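These bounds can be checked numerically; the sketch below reproduces the quoted figures under the stated worst case (e = 2 mm at D = 240 mm).

```python
import math

def convergence_error(R, D=240.0, e=2.0):
    """Return (z, f1, f2) in mm for convergence distance R and worst-case
    crosshair separation e at baseline distance D."""
    z = math.sqrt(R**2 - (D / 2.0)**2)
    f1 = z * e / (D + e)  # real convergence point closer than desired (Figure 7A)
    f2 = z * e / (D - e)  # real convergence point beyond the desired point (Figure 7B)
    return z, f1, f2

for R in (1100.0, 1400.0):
    z, f1, f2 = convergence_error(R)
    print(f"R = {R:.0f} mm: z = {z:.0f} mm, f1 = {f1:.1f} mm, f2 = {f2:.1f} mm")
# R = 1100 mm: z = 1093 mm, f1 = 9.0 mm, f2 = 9.2 mm
# R = 1400 mm: z = 1395 mm, f1 = 11.5 mm, f2 = 11.7 mm
```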
The experiments were approved by the local Ethics Committee in Lund, Sweden. The participation was voluntary, and the patients' legal guardians provided informed written consent.
1. Robot setup and configuration
NOTE: This experiment used a dual-arm collaborative industrial robot and its standard control panel with a touch display. The robot is controlled with RobotWare 6.10.01 controller software and the robot integrated development environment (IDE) RobotStudio 2019.525. Software developed by the authors, including the robot application, recording application, and postprocessing scripts, is available at the GitHub repository26.
CAUTION: Use protective eyeglasses and reduced speed during setup and testing of the robot program.
2. Verify the camera calibration
3. Preparation at the start of the surgery
4. Experiment
CAUTION: All personnel should be informed about the experiment beforehand.
5. Repeat
6. Postprocessing
NOTE: The following steps can be carried out using most video editing software or the provided scripts in the postprocessing folder.
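As one example, stacking the two synchronized streams into the top-bottom stereoscopic format used for evaluation (see the representative results) can be sketched in a few lines of Python with OpenCV. The input file names are hypothetical, and the scripts in the postprocessing folder remain the reference implementation.

```python
import cv2

# Hypothetical input files: the two synchronized camera recordings.
left = cv2.VideoCapture("left.mp4")
right = cv2.VideoCapture("right.mp4")
fps = left.get(cv2.CAP_PROP_FPS)
width = int(left.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(left.get(cv2.CAP_PROP_FRAME_HEIGHT))

# Top-bottom stereoscopic layout with the right image on top, as in Video 1.
out = cv2.VideoWriter("top_bottom_3d.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height * 2))
while True:
    ok_left, frame_left = left.read()
    ok_right, frame_right = right.read()
    if not (ok_left and ok_right):
        break
    out.write(cv2.vconcat([frame_right, frame_left]))  # right image stacked on top
out.release()
```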
7. Evaluation
An acceptable evaluation video, with the right image placed at the top in top-bottom stereoscopic 3D, is shown in Video 1. A successful sequence should be sharp, focused, and without unsynchronized image frames. Unsynchronized video streams will cause blur, as shown in Video 2. The convergence point should be centered horizontally, independent of the camera separation, as seen in Figure 9A,B. When the robot transitions between the posi...
During live surgery, the total time of the experiment used for 3D video data collection was limited to ensure patient safety. If the object is unfocused or overexposed, the data cannot be used. The critical steps are camera tool calibration and setup (step 2). The camera aperture and focus cannot be changed once surgery has started; the same lighting conditions and distance should therefore be used during setup and surgery. The camera calibration in steps 2.1-2.4 must be carried out carefully to ensure that the heart ...
The authors have nothing to disclose.
The research was carried out with funding from Vinnova (2017-03728, 2018-05302 and 2018-03651), Heart-Lung Foundation (20180390), Family Kamprad Foundation (20190194), and Anna-Lisa and Sven Eric Lundgren Foundation (2017 and 2018).
Name | Company | Catalog Number | Comments |
2 C-mount lenses (35 mm F2.1, 5 Mpixel) | Tamron | M112FM35 | Rated for 5 Mpixel |
3D glasses (DLP-Link active shutter) | Celexon | G1000 | Any compatible 3D glasses can be used |
3D projector | Viewsonic | X10-4K | Displays 3D in 1080p; can be exchanged for other 3D projectors |
6 M2 x 8 screws | | | To attach the Ximea cameras to the camera adaptor plates |
8 M2.5 x 8 screws | | | To attach the circular mounting plates to the robot wrists |
8 M5 x 40 screws | | | To mount the robot |
8 M6 x 10 screws with flat heads | | | For attaching the circular mounting plate and the camera adaptor plates |
Calibration checkerboard plate (25 x 25 mm) | | | Any standard checkerboard can be used, including printed, as long as the grid is clearly visible in the cameras |
Camera adaptor plates, x2 | | | Designed by the authors in robot_camera_adaptor_plates.dwg, milled in aluminium |
Circular mounting plates, x2 | | | Distributed with the permission of the designer Julius Klein and printed with ABS plastic on an FDM 3D printer. License Tecnalia Research & Innovation 2017. Attached as Mountingplate_ROBOT_SIDE_NewDesign_4.stl |
Fixed-focus USB cameras, x2 (5 Mpixel) | Ximea | MC050CG-SY-UB | With Sony IMX250LQR sensor |
FlexPendant | ABB | 3HAC028357-001 | Robot touch display |
Liveview | | | Recording application developed by the authors |
RobotStudio | ABB | | Robot integrated development environment (IDE) |
USB3 active cables (10.0 m), x2 | | | Thumbscrew lock connector, waterproofed |
YuMi dual-arm robot | ABB | IRB14000 | |