Detecting Nicotine Addiction via Eye Tracking

Eye tracking is a nascent method to identify and assess nicotine dependence. This is an exploratory study to evaluate eye-tracking-based protocols. Results indicate that the protocol is promising. Further investigations evaluating the validity of eye tracking as a paradigm are needed to develop a widely accessible method to identify and assess nicotine dependence that does not require a medical background to administer. Such a method would be important in mitigating the growing smoking epidemic, especially in developing countries.


INTRODUCTION
World Health Organization (WHO) estimates [1] say that 20.2% of the world's population aged over 15 were smokers in 2015. The estimates look especially bad in the European region where the prevalence is estimated to be 38.7% in males and 21% in females. In India [2], 19% of men, 2% of women and 10.7% of adults (99.5 million) smoke tobacco. Also, 38.7% of adults are exposed to second hand smoke at home in India. Tobacco smoking is known to cause and aggravate a plethora of health problems. Smoking is a public health crisis due to its wide availability, addictive nature and impact on people who do not smoke via second hand smoke.
The treatment of any addiction aims at the tangible outcomes of helping the person stop using, stay drug free and be productive in society. Depending on the severity of the addiction, treatment can range from behavioural counselling to pharmacotherapy, such as nicotine patches and gum, and help from support groups. Where medical resources are limited, as is often the case in developing nations and sometimes in developed nations too, selecting individuals for treatment and identifying appropriate measures is important for mitigating addiction as a social epidemic.
Another way to assess recent nicotine use is to detect one of the numerous biomarkers that result from smoking tobacco. The most common methods are measuring breath carbon monoxide and testing saliva, blood, urine and hair. The markers include nicotine, cotinine, anatabine, exhaled CO, carboxyhaemoglobin, acetonitrile and thiocyanate [6].
The measures summarized above come with limitations of their own. The subjective measures are human-administered tests, which inherently makes them prone to bias and manipulation. Some of the tests also require highly trained professionals to administer them, limiting the environments in which they can be used. The objective measures are invasive and take time and sophisticated equipment to process. The resources required to perform these tests again limit their outreach and availability.
FTND is one of the most widely used scales to measure nicotine dependence severity [7]. It is one of the most objective human administered questionnaires for measuring nicotine dependence severity.
It assigns a score from 0 to 10 based on the answers given to 6 multiple choice questions. The simplicity and objective nature of the test are the reason we attempt to predict FTND scores using eye tracking measurements.
The theory of incentive salience [8] postulates that addictive drugs enhance the pleasure rewarding dopamine pathways. These pathways attach "incentive salience" as an attribute to related stimuli.
Incentive salience refers to the want experienced for a rewarding stimulus. Repeated usage increases this effect, perhaps to permanency. These effects can happen independently of the subjective "pleasure" associated with the drugs. Addictive drugs share the ability to enhance mesotelencephalic dopamine neurotransmission, one of the functions of which is to attribute 'incentive salience' to the perception and mental representation of stimuli associated with activation of the system. Put simply, when a person is presented with a stimulus that can activate this system, the psychological process of incentive salience transforms the mental representation of that stimulus, imbuing it with 'salience', i.e. making it an attractive, 'wanted' stimulus. The takeaway is that the brain responds differently to stimuli based on the craving associated with them, especially in the case of drugs.
Attentional bias [9] refers to the tendency of perception to be affected by recurring thoughts. Attentional bias and craving have a mutually excitatory relationship, which gives a basis for expecting correlation in the resulting behaviour. This background supports designing the tasks to quantify the response to visual stimuli that elicit craving. These two theories have been well validated over the years and form the basic tenets of the task design in this study.
The measurement of eye tracking towards drug related cues is emerging as a useful method to study drug seeking behaviour. This has been examined for different substances including smoking [10] [11], alcohol [12], cocaine [13] and morphine and methamphetamine [14] with promising results.
All studies to date have examined the correlation between eye tracking measurements and nicotine dependence. We take this a step further by designing tasks that are simple to administer and by having a machine learning algorithm assign the score, removing human bias from the assessment.

METHODS
Participants
A total of 30 participants (15 smokers and 15 non-smokers) completed the study. The participation was approved by the Institute Ethics Committee of IIT Delhi. All data collection was performed under strict ethical standards including informed consent, complete confidentiality and anonymisation of data. All subjects were male students from IIT Delhi. The distribution of subjects by age and FTND scores is summarised in Table 1.
Each subject was provided with written details of the tasks along with the contact details of the investigators conducting the study. Then a modified version of Mini-International Neuropsychiatric Interview (MINI) [15] was used to evaluate if the subject satisfied any of the exclusion criteria based on psychiatric disorders. Participants with physical limitations obstructing participation in the study and ones with history of learning disability or traumatic brain injury were also excluded from the study.
A demographic information questionnaire was filled out by each participant who met the inclusion criteria. After this the FTND [5] was administered, and participants with zero and non-zero scores were assigned to the control and nicotine dependent groups respectively. Measures were taken to ensure that the distribution of the FTND score in the nicotine dependent group was as close to uniform as possible.

Apparatus and Setup
The tasks used visual cues of images, video and text presented on a 15.6-inch screen. The participant sat on a chair while the screen and the eye tracker were set up on a table in front of them in an isolated, light-adjusted environment. The screen was placed at a viewing distance of ~1 m from the subject, and its height was adjusted so that the horizontal line of sight was directed at the centre of the screen. The subject was left alone in the room after being given all instructions so that data collection was free of interruptions. Gaze was recorded with the Eye Tribe eye tracker, which uses infrared to track eye movement on a computer screen and provides the x and y coordinates of the gaze. The data collection frequency was set to 60 Hz. To improve the accuracy of the tracker's measurements, a logistic regression classifier trained on hand-tagged data was used to reject erroneous measurements, and exponential smoothing was applied.
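The smoothing step described above can be illustrated with a minimal sketch. It assumes that rejected measurements have already been marked as None (the paper uses a logistic regression classifier for that decision, which is not reproduced here) and that a rejected sample is replaced by the previous smoothed estimate. The function name, the smoothing factor alpha and the carry-forward rule are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Optional, Tuple

Gaze = Tuple[float, float]


def smooth_gaze(samples: List[Optional[Gaze]], alpha: float = 0.3) -> List[Gaze]:
    """Exponentially smooth gaze points; None marks a rejected sample.

    alpha is an assumed smoothing factor in (0, 1]; larger values follow
    the raw signal more closely.
    """
    smoothed: List[Gaze] = []
    state: Optional[Gaze] = None
    for sample in samples:
        if sample is None:
            # Rejected measurement: carry the previous estimate forward.
            # Leading rejected samples are dropped (nothing to carry yet).
            if state is not None:
                smoothed.append(state)
            continue
        if state is None:
            # First valid sample initialises the filter.
            state = sample
        else:
            state = (
                alpha * sample[0] + (1 - alpha) * state[0],
                alpha * sample[1] + (1 - alpha) * state[1],
            )
        smoothed.append(state)
    return smoothed
```

At 60 Hz a small alpha trades responsiveness for stability; a suitable value would have to be tuned against the hand-tagged data.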

Task Design
Each participant performed 4 tasks. The tasks were preceded by a 16-point calibration. In the first task an image would appear at one of four positions in a 2x2 grid on the screen, the other 3 positions displaying a red X, and the subject had to look away from the image as quickly as possible. This task was borrowed from the attentional bias study pertaining to cocaine performed by Dias [13] and measures the anti-saccade response of the subject.
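As a rough illustration of how gaze samples from the first task might be related to the 2x2 grid, the sketch below maps each gaze point to a grid cell and computes the fraction of samples that fall on the cell showing the image. This dwell fraction is a hypothetical feature chosen for illustration; the features actually used are listed in Table S1. The sketch also assumes screen coordinates with the origin at the top-left corner and y increasing downwards.

```python
from typing import Iterable, Tuple


def quadrant_of(x: float, y: float, width: float, height: float) -> int:
    """Return the 2x2 grid cell of a gaze point.

    Cells are numbered 0 = top-left, 1 = top-right,
    2 = bottom-left, 3 = bottom-right (assumed convention).
    """
    col = 1 if x >= width / 2 else 0
    row = 1 if y >= height / 2 else 0
    return 2 * row + col


def dwell_fraction(
    gaze: Iterable[Tuple[float, float]],
    image_cell: int,
    width: float,
    height: float,
) -> float:
    """Fraction of gaze samples that land on the cell containing the image."""
    points = list(gaze)
    if not points:
        return 0.0
    hits = sum(
        1 for x, y in points if quadrant_of(x, y, width, height) == image_cell
    )
    return hits / len(points)
```

In an anti-saccade task a lower value would indicate that the subject looked away from the image more successfully.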
In the second task an image is displayed at the centre of the screen while a pointer (red X) traverses a trajectory of 2 concentric circles, centred at the centre of the screen, starting with the outer circle then the inner circle. The subject is instructed to follow the motion of the pointer. This task aims to measure the pro-saccade response from the subject.
In the third task a red X is displayed at the centre of the screen while an image traverses a trajectory of 2 concentric circles, centred at the centre of the screen, starting with the outer circle then the inner circle. The subject is instructed to fix their gaze on the red X at the centre of the screen.
This task also aims to measure the pro-saccade response from the subject. It differs from task 2 in that the stimulus is non-stationary.
Each of the first 3 tasks is performed with 16 different images, separate for each task, divided into 4 sets of 4, with each set containing 2 smoking related images and 2 neutral images. Images in a set are selected so that the focal visual stimulus is presented similarly, to ensure parity. The design allows comparisons across 2 dimensions: measurements from non-smokers vs smokers, and measurements on smoking related stimuli vs neutral stimuli. Consecutive images are separated by a gap of 2 s and consecutive sets by 10 s to allow the subject to recentre the gaze and blink if needed. Tasks 2 and 3 are original tasks designed by the investigators.
The fourth task is a modified version of the n-Back test [17]. A video containing a smoking related stimulus and one containing a neutral stimulus are played at a size smaller than the screen, leaving space above and below the video. During the video 6 digits appear one by one, randomly above or below the video, at randomly spaced intervals. At the end of the video the subject is asked to reproduce the sequence. This task is intended to measure attentiveness.
Illustrations of the tasks are presented in Figure 1.

Feature Extraction
The raw eye tracking data, recorded as a timestamp and the corresponding x and y coordinates of the gaze for each stimulus (image/video) of each task, were processed through custom pattern extraction algorithms to produce semantic features. Feature extraction was limited to the granularity of a single stimulus instance to avoid capturing noisy artifacts in the data. The features from each stimulus were aggregated over all smoking related stimuli and neutral stimuli separately to yield features corresponding to each task. A list of all extracted features is available in Table S1.

Feature Engineering and ML Modelling
All features were evaluated for the predictive information they contain using their correlation coefficient with the corresponding FTND scores. We found that, owing to 6 outlier subjects, this correlation was in most cases not very valuable. In these 6 subjects, non-smokers distinctly displayed behaviour that the hypothesis associates with smokers, and vice versa. It is possible that such behaviour arises from an unconscious effect of incentive salience guiding their actions. This could not be remedied through outlier analysis given the small size of the sample.
Owing to the above limitations we first chose to analyse the features extracted from the tasks by trying to classify subjects into smokers and non-smokers. The class of machine learning models chosen was tree-based classifiers. The modelling had two variants: one where features from each task were used independently, and a second where features from all tasks were used together to classify the subjects.
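The aggregation and correlation steps can be sketched as follows. The sketch assumes that per-stimulus feature values are averaged within each stimulus type (the text says only that they are aggregated) and that the correlation coefficient is Pearson's r. The record layout with "kind" and "value" keys is hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def aggregate_by_stimulus_type(per_stimulus: List[Dict]) -> Dict[str, float]:
    """Average one feature separately over smoking and neutral stimuli.

    Each record is assumed to look like {"kind": "smoking", "value": 1.5}.
    """
    values: Dict[str, List[float]] = {"smoking": [], "neutral": []}
    for record in per_stimulus:
        values[record["kind"]].append(record["value"])
    # Skip a stimulus type that has no records to avoid dividing by zero.
    return {kind: sum(v) / len(v) for kind, v in values.items() if v}


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between a feature and the FTND scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

With only 30 subjects, a handful of subjects behaving against the hypothesis can pull r towards zero, which is consistent with the outlier effect described above.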
The models used were random forest classifier and gradient boosted tree classifier. The gradient boosted tree performed better in all scenarios except for task 4. Gradient boosted trees use the boosting approach to learn sequential classifiers that improve upon the performance of the model so far and were expected to perform better in almost all cases than the random forest approach.
The training was done using leave-two-out cross validation, leaving one smoker and one non-smoker out of the training set, to ensure maximal usage of the limited dataset. The hyperparameters were chosen by a grid search over values that were appropriate given domain knowledge and the size of the dataset. The Python library scikit-learn was used to train the models [16].
Based on the correlation analysis, the number-of-switches-in-direction feature was found to correlate particularly well with the FTND score, with a coefficient of 0.625. As an exploration we trained a linear regression model that uses this feature to predict the FTND score for a given subject. The behaviour that was visualised and used to extract this feature for all subjects is presented in Figure 2. Similar comparisons for Task 1 data are present in Figures S1 and S2.
The data and code used for feature extraction and modelling are available at https://github.com/sharanmayank/smoking_severity_analysis.
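One way to read the leave-two-out scheme is sketched below: every (smoker, non-smoker) pair is held out in turn, giving 15 x 15 = 225 folds for this sample. Whether the study enumerated all pairs or used a smaller set of pairs is not stated, so this is an assumption. The nearest-mean classifier is only a stand-in to keep the sketch dependency-free; it is not one of the tree-based models used in the study.

```python
from itertools import product
from typing import Callable, Iterator, List, Sequence, Tuple


def leave_pair_out(labels: Sequence[int]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx), holding out one smoker (1) and one non-smoker (0)."""
    smokers = [i for i, y in enumerate(labels) if y == 1]
    non_smokers = [i for i, y in enumerate(labels) if y == 0]
    for s, n in product(smokers, non_smokers):
        train = [i for i in range(len(labels)) if i not in (s, n)]
        yield train, [s, n]


def nearest_mean_fit_predict(
    train_x: Sequence[float], train_y: Sequence[int], test_x: Sequence[float]
) -> List[int]:
    """Stand-in single-feature classifier: pick the class with the closer mean."""
    means = {}
    for c in (0, 1):
        vals = [x for x, y in zip(train_x, train_y) if y == c]
        means[c] = sum(vals) / len(vals)
    return [min((0, 1), key=lambda c: abs(x - means[c])) for x in test_x]


def cross_validated_accuracy(
    features: Sequence[float],
    labels: Sequence[int],
    fit_predict: Callable[[Sequence[float], Sequence[int], Sequence[float]], List[int]],
) -> float:
    """Pool predictions over all held-out pairs and return the accuracy."""
    correct = total = 0
    for train, test in leave_pair_out(labels):
        preds = fit_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in test],
        )
        correct += sum(int(p == labels[i]) for p, i in zip(preds, test))
        total += len(test)
    return correct / total
```

A scikit-learn estimator could be substituted by passing a callable that fits the model on the training fold and returns its predictions for the held-out pair.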
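The text does not define how a switch in direction is counted, so the following is only one plausible reading: a switch is a sign change in the horizontal gaze displacement between consecutive samples, with steps at or below a jitter threshold ignored. The function name and the min_step parameter are hypothetical.

```python
from typing import Sequence


def count_direction_switches(xs: Sequence[float], min_step: float = 0.0) -> int:
    """Count sign changes in consecutive horizontal gaze displacements.

    Steps with magnitude <= min_step are treated as jitter and skipped,
    so they neither count as a switch nor reset the current direction.
    """
    switches = 0
    last_sign = 0  # 0 means no direction has been established yet
    for prev, cur in zip(xs, xs[1:]):
        step = cur - prev
        if abs(step) <= min_step:
            continue
        sign = 1 if step > 0 else -1
        if last_sign != 0 and sign != last_sign:
            switches += 1
        last_sign = sign
    return switches
```

For example, the sequence 0, 1, 2, 1, 0, 1 moves right, then left, then right again, which this definition counts as two switches.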

RESULTS
The performance of the trained classifier models was quantified using classification accuracy. The regression model was assessed using normalized mean absolute error, normalized mean squared error, R-squared value (coefficient of determination) and the p-values of the learned coefficients.
The number of trees and the maximum depth of the trees for both types of classifier models were chosen based on the number of features available to the classifier, with the final determination made by grid search. The results are summarized in Table 2. As noted before, a few outliers in the data set prevent perfect accuracy in prediction, but even so we achieved an accuracy of 83.33% using features from all tasks. Among individual tasks, the features from task 1 performed best, achieving an accuracy of 80%. The linear regression model was able to predict the FTND score with a normalized mean absolute error of 0.732. The feature weights for the selected models are available in Tables S2 to S6.
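The regression metrics named above can be computed as in the sketch below. The text does not say which quantity the errors are normalized by, so the normaliser is left as an explicit scale parameter (an assumption); with scale=1.0 the function returns the plain mean absolute and mean squared errors.

```python
from typing import Dict, Sequence


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float], scale: float = 1.0
) -> Dict[str, float]:
    """Return normalized MAE, normalized MSE and R-squared.

    scale is a hypothetical normaliser (for example the score range);
    MAE is divided by scale and MSE by scale squared.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    return {
        "nmae": mae / scale,
        "nmse": mse / scale ** 2,
        "r2": 1 - ss_res / ss_tot,
    }
```

R-squared is undefined when all observed scores are equal, so the sketch expects scores with some spread.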

Table 2. Classification results. Columns: Model Used, Accuracy, Precision (S), Recall (S), Precision (NS), Recall (NS). Row: Task 1, Gradient Boosted Trees.
The classification accuracy shows that the models predict well whether a subject is a smoker or a non-smoker, and the regression model's performance shows that the paradigm has potential to be developed to predict the severity of nicotine addiction.

DISCUSSION
This study explored visual stimulus-based tasks that use eye tracking measurements to evaluate nicotine dependence, and attempted to quantify that dependence against a well validated and objective reference measure, the FTND. The semantic features extracted from the raw task data showed sufficient separation and trend to classify subjects with low error using classification models and to predict FTND scores using linear regression with statistical significance.
This suggests that incentive salience and attentional bias lead to measurable manifestations in response to visual stimulus that can be used to determine the nicotine dependence of a subject.
Numerous studies over the last 10 to 12 years have used eye tracking measurements to establish correlation between metrics extracted from eye tracking data in response to visual stimuli and craving or dependence on a drug. A 2012 study by Kang et al [10] used eye tracking and fMRI to confirm a positive correlation between different smoking-related cue reactivities, such as attentional bias and subjective craving, and functional brain response in various individuals. Work by Lochbuehler et al [11] on measuring attentional bias in smokers on exposure to smoking cues in movies showed that smoking cues have direct impact on the attention of smokers. A similar study published in 2015 measuring alcohol attentional bias in adolescent social drinkers [12] showed that there is some evidence of attention bias in adolescent social drinkers that can be measured using eye tracking. Evidence of capability of eye tracking to measure attentional bias in cases of drug dependence is established for other drugs such as cocaine, morphine and methamphetamine also [13] [14].
This work is a natural extension to reinforce the validity of eye tracking measurements to evaluate attentional bias and use the measurements to be able to identify and assess nicotine dependence.
To the best of the authors' knowledge this is the first study to use eye tracking measurements, extract features and use them to predict nicotine dependence or the severity of nicotine dependence.
One limitation of this study is that the sample was relatively small and lacked diversity, especially with respect to gender. Despite this limitation, the results reinforce the initial hypothesis that eye tracking can be used to identify nicotine dependence, and they show promise for predicting the severity of nicotine addiction. Further studies are needed to validate this result by testing on a larger and more diverse sample. This would also make it possible to correct for outliers in the data and would likely yield a much more robust model. The protocol can be extended further by adding more tasks and by cross-checking the accuracy and usability of this paradigm against different subjective and objective measures of nicotine dependency.