Assessment of spatiotemporal gait parameters using a deep learning algorithm-based markerless motion capture system

Spatiotemporal parameters can characterize the gait patterns of individuals, allowing assessment of their health status and detection of clinically meaningful changes in their gait. Video-based markerless motion capture is a user-friendly, inexpensive, and widely applicable technology that could reduce the barriers to measuring spatiotemporal gait parameters in clinical and more diverse settings. Two studies were performed to determine whether gait parameters measured using markerless motion capture demonstrate concurrent validity with those measured using marker-based motion capture and pressure sensitive gait mats. For the first study, thirty healthy adults performed treadmill gait at self-selected speeds while marker-based motion capture and synchronized video data were recorded simultaneously. For the second study, twenty-five healthy adults performed over-ground gait at self-selected speeds while footfalls were recorded using a gait mat and synchronized video data were recorded simultaneously. Kinematic heel-strike and toe-off gait events were used to identify the same gait cycles between systems. Nine spatiotemporal gait parameters were measured by each system and directly compared between systems. Measurements were compared using Bland-Altman methods, mean differences, Pearson correlation coefficients, and intraclass correlation coefficients. The results indicate that markerless measurements of spatiotemporal gait parameters have good to excellent agreement with marker-based motion capture and gait mat systems, except for stance time and double limb support time relative to both systems and stride width relative to the gait mat. These findings indicate that markerless motion capture can adequately measure spatiotemporal gait parameters during treadmill and overground gait.


Introduction
Gait analysis is a useful tool for assessing and comparing human movement patterns to gain insight into a variety of health-related factors. Spatiotemporal gait parameters are one form of data obtained through gait analysis and have been shown to be useful clinical measures that can detect 'negative' changes in individuals' gait patterns due to pathology (Elbaz et al., 2014; Givon et al., 2009; Lemke et al., 2000; Morris et al., 2001) or aging (Hollman et al., 2011), and 'positive' changes due to rehabilitation (Fung et al., 2006; Patterson et al., 2008) or locomotor training (Abd El-Kafy and El-Basatiny, 2014; Smania et al., 2011; Vitale et al., 2012). They have been used to study the gait patterns of children (Alderson et al., 2019), older adults (Vallabhajosula et al., 2019), individuals with Parkinson's disease (Mondal et al., 2019), dementia (Darweesh et al., 2019), or multiple sclerosis (Novotna et al., 2019), and post-stroke patients (Cleland et al., 2019), among others. However, it is crucial that they are obtained using objective techniques to ensure adequate accuracy and repeatability (Toro et al., 2003).
Two of the most common technologies used to measure spatiotemporal gait parameters are marker-based motion capture and pressure-sensitive gait mats; however, both of these technologies have limitations that reduce their potential uses. Marker-based motion capture systems are expensive, require experienced operators, and are largely limited to in-laboratory data collections. Pressure-sensitive gait mats are less expensive and simpler to operate, but they require subjects to walk on their surface, reducing the potential environments in which they can be used and the data that can be collected.
Automated two-dimensional (2D) video-based markerless motion capture is an emerging technology that has the potential to measure spatiotemporal parameters without many of the limitations of marker-based motion capture or pressure-sensitive gait mats. Theia3D (Theia Markerless Inc., Kingston, ON) is one such markerless motion capture software, which uses synchronized video data and deep learning techniques to estimate three-dimensional (3D) human pose, enabling it to measure spatiotemporal gait parameters without the need for skin-mounted markers, a laboratory environment, or a specific walking surface.
The aim of this work was to determine the validity of spatiotemporal gait parameter measurements obtained using markerless motion capture. This was achieved through the completion of two studies: a concurrent comparison of spatiotemporal gait parameters from marker-based and markerless motion capture during treadmill walking, and a concurrent comparison of spatiotemporal gait parameters from a pressure-sensitive gait mat and markerless motion capture during over-ground walking.

Markerless Motion Capture
Theia3D (Theia Markerless Inc., Kingston, ON, Canada) is a deep learning algorithm-based approach to markerless motion capture which uses deep convolutional neural networks for feature recognition (humans and human features) within 2D camera views (Kanko et al., 2020a, 2020b; Mathis and Mathis, 2020). The neural networks were trained on digital images of over 500,000 humans in the wild. In the version used for this study, 51 salient features consisting of a variety of joint locations and other identifiable surface features in the images were manually labelled by highly trained annotators and controlled for quality by a minimum of one additional expert labeller. These training images consisted of humans in a wide array of settings, clothing, and performing various activities. Deconvolutional layers are used to produce spatial probability densities for each image, representing the likelihood that an anatomical feature is in a particular location. For a given image, the network assigns high probabilities to labelled anatomical feature locations and low probabilities elsewhere. The learning that occurs during training enables the application of "rules" for identifying the learned features within a new image.
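To illustrate how a spatial probability density can be reduced to a single 2D feature location, the following minimal sketch takes the most probable pixel of one heatmap. This is an assumption made for illustration only: the text does not describe how Theia3D extracts feature positions, and the function name `peak_location` and the toy heatmap are hypothetical.

```python
import numpy as np

def peak_location(heatmap):
    """Return (column, row, probability) of the most likely pixel in a 2D heatmap.

    Illustrative assumption: the feature location is taken as the arg-max of
    the spatial probability density. The real software may do something else.
    """
    h = np.asarray(heatmap, dtype=float)
    row, col = np.unravel_index(np.argmax(h), h.shape)
    return int(col), int(row), float(h[row, col])

# Toy 48 x 64 heatmap with one confident detection at column 12, row 30.
toy = np.zeros((48, 64))
toy[30, 12] = 0.9
print(peak_location(toy))
```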
When using this markerless motion capture software, the user provides newly collected video data from multiple synchronized and calibrated video cameras that capture one or more subjects performing a physical task. The time required to collect data is largely dependent on the task of interest but can take less than five minutes for the collection of ten walking trials, for example. Two-dimensional positions of the learned features are estimated within all frames of all of the videos, which are then transformed to 3D space based on the intrinsic and extrinsic parameters of the cameras. Finally, an articulated multi-body model is scaled to fit the subject-specific landmark positions in 3D space, and a multi-body optimization approach (inverse kinematics (IK)) is used to estimate the 3D pose of the subject throughout the physical task. By default, the lower body kinematic chain has six degrees-of-freedom (DOF) at the pelvis, and three DOF at the hip, knee, and ankle. This markerless system has been shown to measure gait kinematics similarly to marker-based systems, and with greater reliability over multiple sessions (Kanko et al., 2020a, 2020b).
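The step from 2D feature positions to 3D space can be sketched with standard linear (direct linear transform) triangulation, which combines each camera's intrinsic and extrinsic parameters into a 3x4 projection matrix. This is a generic textbook technique, not a description of Theia3D's internal method; the camera matrices, the point, and the function names below are hypothetical.

```python
import numpy as np

def project(P, point_3d):
    """Project a 3D point to pixel coordinates with a 3x4 projection matrix."""
    x = P @ np.append(point_3d, 1.0)
    return x[:2] / x[2]

def triangulate(projections, pixels):
    """Linear triangulation of one 3D point from two or more calibrated views.

    projections: list of 3x4 matrices P = K [R | t] (intrinsics times extrinsics)
    pixels: list of (u, v) image coordinates of the same feature in each view
    """
    rows = []
    for P, (u, v) in zip(projections, pixels):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]
    return X[:3] / X[3]

# Hypothetical two-camera setup: shared intrinsics, second camera shifted 1 m in x.
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
point = np.array([0.2, 0.1, 5.0])
estimate = triangulate([P1, P2], [project(P1, point), project(P2, point)])
print(np.round(estimate, 6))
```

With noise-free pixel coordinates the estimate reproduces the original point; with real detections, additional views reduce the effect of error in any single camera.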

Participants
A convenience sample of healthy, recreationally active adults was recruited to participate in both studies at the Human Mobility Research Laboratory (Kingston, ON). Subject sample characteristics are summarized in Table 1. Participants gave written informed consent and both studies were approved by the institutional ethics committee. Exclusion criteria included having any neuromuscular or musculoskeletal impairments that could affect their performance of walking. Participants wore minimal, skin-tight clothing and their personal athletic shoes. Retroreflective markers were affixed bilaterally on the first, fifth, and between the second and third metatarsal heads, on the calcaneus, malleoli, tibial tuberosity, femoral epicondyles, anterior superior and posterior superior iliac spines, lateral iliac crest, suprasternal notch, C7 vertebra, superior acromion, lateral humeral head, humeral epicondyles, radial and ulnar styloid processes, and the third metacarpus. Rigid tracking clusters (4 markers/cluster) were affixed to the shanks, thighs, and lower back, and a headband with five markers was worn.
A static calibration trial for the marker-based motion capture data was collected with the subject standing on the treadmill; no static trial is required for the markerless system. Starting at an initial speed of 1.2 m/s, participants determined a comfortable self-selected walking speed by providing feedback to researchers. Participants acclimatized to the treadmill for two minutes before ten consecutive trials of four seconds were collected simultaneously using both camera systems.

Pressure-Sensitive Gait Mat Comparison
A GAITRite mat (CIR Systems, Inc., Franklin, NJ) was positioned centrally within a large laboratory space, around which eight Sony RX0 II cameras (Sony Corporation, Minato, Japan) were arranged and synchronized using a Sony Camera Control Box. Red tape lines were placed on the ground ten meters apart as walkway start/finish lines, and the mat was positioned centrally between the lines.
Subjects wore their own clothing and shoes and performed six over-ground walking trials between the start/finish lines at their comfortable walking speed, alternating direction for each trial. Synchronized 2D video data and foot-ground contact positions and times were recorded during each trial, with both systems recording at 60 Hz. Foot placements that did not occur entirely within the pressure-sensitive region of the mat were not recorded.

Motion Capture Comparison
Marker-based motion capture data were tracked in QTM, video data were processed using Theia3D (v2020.3.0.962), and both sets of processed data were exported for analysis in Visual3D (C-Motion Inc., Germantown, MD). Two skeletal models were generated in Visual3D: one, created automatically by Visual3D, that tracked the markerless motion capture data; and one, defined manually with the same joint constraints as those used in Theia3D, that tracked the marker-based data.
Kinematic gait events were created for the marker-based and markerless models independently, and were confirmed to represent the same gait events (Zeni et al., 2008).
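As an illustration of kinematic gait event detection, the sketch below follows the coordinate-based idea of Zeni et al. (2008): heel-strike is taken at the peak forward position of the heel relative to the pelvis, and toe-off at the peak backward position of the toe relative to the pelvis. The exact event definitions implemented in Visual3D are not given in the text, so the use of a pelvis reference, the peak-finding rule, and all names and synthetic signals here are assumptions.

```python
import numpy as np

def local_maxima(signal):
    """Indices of samples greater than the previous sample and not less than the next."""
    s = np.asarray(signal, dtype=float)
    return np.where((s[1:-1] > s[:-2]) & (s[1:-1] >= s[2:]))[0] + 1

def kinematic_gait_events(heel_ap, toe_ap, pelvis_ap):
    """Return (heel_strike_frames, toe_off_frames) for one foot.

    All inputs are anterior-posterior positions along the direction of
    progression, sampled at the same frames.
    """
    heel_rel = np.asarray(heel_ap, dtype=float) - np.asarray(pelvis_ap, dtype=float)
    toe_rel = np.asarray(toe_ap, dtype=float) - np.asarray(pelvis_ap, dtype=float)
    heel_strikes = local_maxima(heel_rel)   # heel furthest ahead of the pelvis
    toe_offs = local_maxima(-toe_rel)       # toe furthest behind the pelvis
    return heel_strikes, toe_offs

# Synthetic 3 s example at 100 Hz with a 1 s gait cycle (hypothetical data).
t = np.arange(300) / 100.0
pelvis = 1.2 * t                               # pelvis advancing at 1.2 m/s
heel = pelvis + 0.3 * np.sin(2 * np.pi * t)
toe = pelvis + 0.3 * np.cos(2 * np.pi * t)
hs, to = kinematic_gait_events(heel, toe, pelvis)
print(hs, to)
```

Because the events depend only on relative positions, the same function applies to treadmill data (pelvis roughly stationary) and over-ground data (pelvis advancing).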
Step length, stride length, stride width, step time, cycle time, swing time, stance time, double limb support time, and trial-average gait speed were calculated from the marker-based and markerless models and exported for further analysis in MATLAB (The MathWorks Inc., Natick, MA) and SPSS (IBM, Armonk, NY). These parameters are defined in Table 2. Twenty measurements of step length, stride length, step time, cycle time, swing time, and stance time; ten measurements of stride width; and five trial-average measurements of double limb support time were randomly selected from those available and directly compared between systems for each subject.

Gait Parameter | Description
Gait Speed | Distance covered per second, calculated as the measured stride length divided by the measured stride time, reported in meters per second [m/s].
Step Length | Distance from the position of the proximal contralateral foot (ankle joint) at the previous contralateral heel-strike to the position of the proximal ipsilateral foot (ankle joint) at the ipsilateral heel-strike taken in the direction of progression, reported in centimeters [cm].
Stride Length | Distance from the position of the proximal ipsilateral foot (ankle joint) at ipsilateral heel-strike to the position of the proximal ipsilateral foot (ankle joint) at the successive ipsilateral heel-strike taken in the direction of progression, reported in centimeters [cm].
Stride Width | Perpendicular distance between the position of the proximal contralateral foot (ankle joint) at contralateral heel-strike to the vector between positions of the proximal ipsilateral foot (ankle joint) at successive ipsilateral heel-strikes, reported in centimeters [cm].
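The distance-based definitions above can be made concrete with a short sketch that computes step length, stride length, stride width, and gait speed from ankle positions at three consecutive heel-strikes. It assumes positions are given in metres on the ground plane and that the direction of progression is the line between successive ipsilateral heel-strikes; both are illustrative assumptions, as are the function names and example coordinates.

```python
import numpy as np

def spatial_parameters(ipsi_hs, contra_hs, next_ipsi_hs):
    """Step length, stride length, and stride width in centimetres.

    Each argument is the (x, y) ground-plane ankle position in metres at an
    ipsilateral, the following contralateral, and the successive ipsilateral
    heel-strike. Assumption: the direction of progression is the unit vector
    between the two ipsilateral heel-strikes.
    """
    a = np.asarray(ipsi_hs, dtype=float)
    c = np.asarray(contra_hs, dtype=float)
    b = np.asarray(next_ipsi_hs, dtype=float)
    stride_vec = b - a
    stride_length = np.linalg.norm(stride_vec)
    direction = stride_vec / stride_length
    # Step length: contralateral to ipsilateral foot, along the progression direction.
    step_length = float(np.dot(b - c, direction))
    # Stride width: perpendicular distance from the contralateral foot to the stride line.
    offset = c - a
    stride_width = abs(direction[0] * offset[1] - direction[1] * offset[0])
    return {
        "step_length_cm": 100.0 * step_length,
        "stride_length_cm": 100.0 * float(stride_length),
        "stride_width_cm": 100.0 * float(stride_width),
    }

def gait_speed(stride_length_cm, stride_time_s):
    """Gait speed in m/s as stride length divided by stride time."""
    return (stride_length_cm / 100.0) / stride_time_s

params = spatial_parameters((0.0, 0.0), (0.65, 0.10), (1.30, 0.0))
print(params, gait_speed(params["stride_length_cm"], 1.0))
```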

Pressure-Sensitive Gait Mat Comparison
Spatiotemporal gait parameters were automatically calculated using GAITRite software (v. 4.0) and were exported for analysis in MATLAB. Video data were processed using Theia3D and exported to Visual3D, where the automatic skeletal model was generated. Kinematic gait events were created based on the markerless skeletal model for the gait cycles that occurred on the pressure-sensitive gait mat, which were confirmed to match those recorded by the mat. Gait parameters with definitions matching those generated by the GAITRite software were calculated in Visual3D and exported for further analysis alongside the gait mat measurements in MATLAB and SPSS. Ten measurements of step length, stride length, stride width, step time, cycle time, swing time, and stance time; and four trial-average measurements of double limb support time were randomly selected from those available and directly compared between systems for each subject.

Statistical Analysis
Means and standard deviations of the differences in gait parameter measurements made on an individual measurement basis between systems (marker-based and markerless motion capture; gait mat and markerless motion capture) were calculated across all measurements. Bland-Altman plots with bias and 95% limits of agreement (LOA) were created for each parameter (Bland and Altman, 1986).
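A minimal sketch of the Bland-Altman calculation is given below: the bias is the mean of the paired differences, and the 95% limits of agreement are the bias plus or minus 1.96 standard deviations of those differences. The function name and the example values are hypothetical and are not data from either study.

```python
import numpy as np

def bland_altman(system_a, system_b):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    bias is the mean of (system_a - system_b); the 95% limits of agreement
    are bias +/- 1.96 times the sample standard deviation of the differences.
    """
    diff = np.asarray(system_a, dtype=float) - np.asarray(system_b, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return float(bias), float(bias - 1.96 * sd), float(bias + 1.96 * sd)

# Hypothetical paired step-length measurements in cm.
markerless = [61.0, 62.0, 63.0, 64.0]
reference = [61.1, 61.9, 63.2, 63.8]
print(bland_altman(markerless, reference))
```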
Pearson's correlation coefficient (r) was calculated to assess correlation, and intraclass correlation coefficients (ICC(A,1)) were calculated to assess agreement (McGraw and Wong, 1996). ICC values of < 0.5 were interpreted as poor, 0.5-0.75 were interpreted as moderate, 0.75-0.9 were interpreted as good, and > 0.9 were interpreted as excellent (Portney and Watkins, 2009).
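The difference between correlation and agreement can be sketched as follows. The ICC function uses the two-way, absolute-agreement, single-measurement form from McGraw and Wong (1996), computed from the mean squares of a two-way ANOVA without replication. The function names, the handling of values that fall exactly on a category boundary, and the example data are assumptions made for illustration.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation coefficient between two paired series."""
    return float(np.corrcoef(a, b)[0, 1])

def icc_absolute_agreement(ratings):
    """Two-way, absolute-agreement, single-measurement ICC.

    ratings: array of shape (n_measurements, k_systems).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-measurement mean square
    msc = ss_cols / (k - 1)               # between-system mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return float((msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n))

def interpret_icc(icc):
    """Map an ICC value to the categories used in the text.

    Assumption: boundary values fall into the higher-numbered range listed
    first in the text (e.g. exactly 0.9 is 'good').
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"

# Hypothetical data: system B reads exactly 1 unit higher than system A.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = a + 1.0
icc = icc_absolute_agreement(np.column_stack([a, b]))
print(pearson_r(a, b), icc, interpret_icc(icc))
```

In this example the correlation is perfect while the absolute-agreement ICC is lower, because a constant offset between systems is penalized by the ICC but not by Pearson's r.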

Results
Summary metrics for the differences between spatiotemporal gait parameters measured using the two systems in each study are shown in Table 3.
Gait speed measurements were practically identical between marker-based and markerless motion capture systems, and very similar between the gait mat and markerless motion capture (Figure 1). Mean differences were 0.00 m/s and 0.02 m/s for the motion capture and gait mat comparisons, respectively. LOA for the gait mat comparison indicated markerless gait speed measurements are slightly but systematically slower than those from the gait mat. Correlation coefficients for both studies indicated near-perfect correlation and excellent agreement; however, the gait mat ICC lower bound indicated the possibility of poor agreement, likely due to two outlier measurements (Table 3).
Step and stride length had mean differences of less than 1 cm for both comparisons, with standard deviation values for both parameters of approximately 3.5 cm and 2.5 cm for the motion capture and gait mat comparisons, respectively. Bland-Altman LOA for these parameters were approximately ±7 cm for the motion capture comparison and ±5 cm for the gait mat comparison (Figure 2). Correlation coefficients indicated both parameters had strong correlations, excellent agreement in the motion capture comparison, and good to excellent agreement in the gait mat comparison.
Stride width had a mean difference of less than 1 cm in the motion capture comparison and -3.66 cm in the gait mat comparison. LOA were roughly ±2 cm for the motion capture comparison and -8 cm to 1 cm for the gait mat comparison. ICC values and bounds indicated excellent agreement between the motion capture systems and poor to moderate agreement between the gait mat and markerless motion capture system.

Discussion
Spatiotemporal parameters are simple measures that effectively characterize gait patterns and allow overall health status to be monitored and changes to be detected, therefore holding potential for clinical applications (Givon et al., 2009; Hollman et al., 2011). Marker-based motion capture is a widely accepted technology that can measure spatiotemporal gait parameters; however, these systems are expensive, require dedicated laboratory space and experienced operators, and are time intensive to use, reducing their clinical applicability. Pressure-sensitive gait mats are another technology used in the measurement of spatiotemporal gait parameters and have had perhaps the greatest success in translation to clinical use due to their simplicity, ease of use, and lower cost. Despite the many benefits of these systems, they are limited to being used in straight, over-ground walking scenarios with their smooth, padded surface as the walking surface. These characteristics of the data collection conditions differ significantly from real-world walking, the majority of which is performed on inconsistent, rough surfaces with obstacles and turns to negotiate. Markerless motion capture presents an alternative that combines some of the advantages of marker-based motion capture and pressure-sensitive gait mats, including relatively low financial and expertise costs, reduced environment and walking surface requirements, and the option to collect data from more than indoor over-ground walking conditions. The aim of this work was to determine if spatiotemporal gait parameters for healthy gait measured using a markerless motion capture system were equivalent to those from a marker-based motion capture system and a pressure-sensitive gait mat.
The results presented here showed that distance-based gait parameters measured using markerless motion capture demonstrated good to excellent agreement with those from marker-based motion capture and a pressure-sensitive gait mat. Agreement in gait speed, step length, and stride length was particularly close between all three measurement systems. Conversely, stride width measurements in the gait mat comparison had low levels of agreement between systems, with the gait mat measuring considerably smaller stride widths than the markerless system. However, the stride width measurements obtained in the motion capture comparison demonstrated excellent agreement between the markerless and marker-based motion capture systems, indicating that the stride width as measured by the gait mat systematically differs from that measured by both motion capture systems.
The time-based gait parameters were generally measured with slightly lower agreement between systems, with the lowest levels of agreement observed in stance time and double limb support time in both comparisons. However, agreement between systems was excellent for cycle time, good for step time, and good for swing time. Furthermore, mean differences in time-based parameter measurements when represented as a difference in number of frames were approximately 0 to 4 frames for the motion capture comparison (at 85 Hz) and 0 to 3 frames for the gait mat comparison (at 60 Hz). Considering the use of kinematic-based gait events, these differences in event detection timing are unsurprising, and represent what is likely to be the biggest contributor to the differences in the measurement of temporal gait parameters observed here.
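To put the frame counts in time units, the small sketch below converts a number of frames to milliseconds at a given sampling rate. It only restates the arithmetic implied by the frame rates quoted above; the function name is hypothetical.

```python
def frames_to_ms(n_frames, rate_hz):
    """Duration in milliseconds of n_frames sampled at rate_hz."""
    return 1000.0 * n_frames / rate_hz

# Upper ends of the ranges quoted in the text.
print(round(frames_to_ms(4, 85), 1))  # motion capture comparison, about 47.1 ms
print(round(frames_to_ms(3, 60), 1))  # gait mat comparison, 50.0 ms
```

The upper ends of the two ranges therefore correspond to timing differences of roughly 50 ms in both comparisons.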
Minimal detectable change (MDC) values for these spatiotemporal gait parameters have been measured for a variety of pathological populations, including chronic stroke patients (Geiger et al., 2019), adults with multiple sclerosis (Andreopoulou et al., 2019), post-incomplete spinal cord injury patients (Nair et al., 2012), chronic low back pain patients (Fernandes et al., 2015), and adults with cerebral palsy (Levin et al., 2019) (Fernandes et al., 2015; Nair et al., 2012; Wittwer et al., 2013). Of the spatiotemporal gait parameters measured in this work, only the mean differences in stride width and double limb support time from the gait mat comparison were greater than their corresponding MDC values, indicating that on average the differences observed in measurements between systems would not limit the detection of clinically relevant changes in the remaining gait parameters.
Despite the agreement observed between systems, there are limitations to the present study that should be considered. The subject samples were composed of healthy, active, young individuals, who are not representative of the typically older, injured, or pathological populations for which gait analysis is often used. In addition, since the markerless motion capture system is a purely image-based approach, its measurement of walking patterns is theoretically independent of the subject's health status, appearance, and the collection environment. However, the lack of sensitivity of the markerless system to these factors has yet to be confirmed. Subsequent work will investigate these factors and test the ability of markerless motion capture in wider applications.
The studies described in this work were performed using two different video camera systems, each with its own video recording characteristics (resolution, focal length, depth of field, etc.), as the input data to the markerless motion capture system. In addition, the studies used vastly different subject clothing conditions, with motion capture clothing and markers worn in the motion capture comparison and everyday clothing of the subjects' choice worn in the gait mat comparison. Despite these differences in collected data, the results indicate that neither the camera system nor the clothing conditions significantly affect the measurement of spatiotemporal gait parameters in either study. This supports the implementation of either camera system with Theia3D markerless motion capture software and indicates that both motion capture and everyday clothing can be used when collecting markerless data.
Based on the results presented here, markerless motion capture is capable of adequately measuring spatiotemporal gait parameters of healthy adults during treadmill and overground walking.
This initial demonstration of the accuracy of this system should prompt further investigation of its capability to measure spatiotemporal gait parameters of impaired gait and in a wider range of environments.

Conflict of Interest Statement
Scott Selbie is the President and Marcus Brown is the Director of Technology of Theia Markerless Inc.