Using the News!
Last time, I tried out a couple of different methods for assigning sentiment to news articles, and found that the best performance seemed to come from my Temporal Interference method initialized with zeroes. Well, there's a little more information available to us: the content of the news articles themselves!
So the basic idea here is to train a model using Temporal Interference, use that model to score each news article, and then use those scores as the new Perturbation model. This could be repeated as an iterative process, which should eventually converge. Of course, there wasn't a particularly clear way to do this, so I tried several.
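The alternation described above can be sketched as a small driver that applies steps in a given order. This is only an illustration under assumptions: `run_schedule`, the list-of-floats state, and the two placeholder step functions are hypothetical names made up for the example, and the placeholders do not implement Temporal Interference or news scoring.

```python
from typing import Callable, Dict, List

State = List[float]              # hypothetical stand-in for the Perturbation model
Step = Callable[[State], State]  # a step takes the current model and returns a new one


def run_schedule(schedule: str, steps: Dict[str, Step], n: int) -> State:
    """Apply steps in the order named by a schedule string such as "Z,TI,NS,TI"."""
    names = schedule.split(",")
    if names[0] != "Z":
        raise ValueError("schedule must start with the zero initialization 'Z'")
    state = [0.0] * n  # Z: start from an all-zero model
    for name in names[1:]:
        state = steps[name](state)  # raises KeyError for an unknown step name
    return state


# Placeholder steps so the driver can be exercised. They are NOT the real methods.
def placeholder_ti(state: State) -> State:
    return [0.5 * x + 1.0 for x in state]


def placeholder_ns(state: State) -> State:
    return [round(x, 1) for x in state]


result = run_schedule("Z,TI,NS,TI", {"TI": placeholder_ti, "NS": placeholder_ns}, 3)
```

Writing the order as a string makes it easy to compare several orderings against the same data by changing only the schedule.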
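For this first set of results, I tried various orderings of the modeling steps, with the duration set to its actual value. In the labels below, Z is the zero initialization, TI is a Temporal Interference step, and NS is a news step. To save time, I skipped data set 5 because of its size. Only results for 1-4 and 6 are shown: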
Z,TI: 0.944, 0.902, 0.887, 0.852, 0.889, 0.708
Z,TI,NS: 0.858, 0.880, 0.810, 0.850, 0.854, 0.749
Z,TI,NS,TI: 0.951, 0.921, 0.907, 0.858, 0.744
Z,TI,NS,NS: 0.821, 0.853, 0.524, 0.761, 0.513
Z,TI,NS,NS,TI: 0.947, 0.906, 0.890, 0.853, 0.716
Z,TI,NS,TI,NS: 0.859, 0.880, 0.813, 0.848, 0.763
Z,TI,(NS,TI)x2: 0.951, 0.921, 0.907, 0.858, 0.747
What we learn here is that incorporating news information into the model does have an advantage. For the most part, we saw the best performance when alternating News and Temporal Interference steps and ending with Temporal Interference. For data set 6, however, the best performance came when we ended with the News Analysis, which was somewhat counter-intuitive.
Also, for most data sets the progress stabilized after just one iteration, but this was not the case for data set 6, so I tried adding two more steps. Adding another News step gave an accuracy of 0.765, and adding another Temporal Interference step brought us back to 0.747. Considering the kind of information made available to the system, this is somewhat impressive, but it really goes to show why my initial attempts at this would never have worked.
I then considered what would happen if we did not know ahead of time what duration to expect for perturbations. I ran these studies on data set 3 only. If we reduce the duration to just 49, the accuracy with Temporal Interference alone drops significantly, from 0.887 at the actual duration to 0.692. If we keep alternately adding NS and TI steps, the accuracy moves to 0.739, 0.702, 0.745, 0.703, 0.747, 0.703. Interestingly, this one also performs best when we end with the News analysis.
If we reduce the duration to 40, the accuracy decreases to 0.642. Running 2 NS/TI loops and ending with NS, we find the accuracy actually decreases further, to 0.624. Odd indeed!
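The zig-zag in the duration-49 numbers is easier to read when the sequence is split by the kind of step it ends on. The snippet below is a small sketch using only the accuracies quoted above; the variable names and the `deltas` helper are made up for the illustration.

```python
# Accuracies quoted above for data set 3 at duration 49.
# The first value is Temporal Interference alone; the rest alternate NS, TI, NS, ...
accuracies = [0.692, 0.739, 0.702, 0.745, 0.703, 0.747, 0.703]

after_ti = accuracies[0::2]  # runs that end with a TI step
after_ns = accuracies[1::2]  # runs that end with an NS step


def deltas(values):
    """Change between consecutive runs that end on the same kind of step."""
    return [round(b - a, 3) for a, b in zip(values, values[1:])]


ti_deltas = deltas(after_ti)
ns_deltas = deltas(after_ns)

# Gap between the last NS-ending run and the last TI-ending run.
gap = round(after_ns[-1] - after_ti[-1], 3)

print("TI-ending:", after_ti, "changes:", ti_deltas)
print("NS-ending:", after_ns, "changes:", ns_deltas)
print("NS minus TI at the end:", gap)
```

Seen this way, each sub-sequence settles quickly on its own, while the gap between ending on NS and ending on TI stays in place rather than closing.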
For duration 51, we see a drop to 0.712, and then an improvement to 0.776.
For duration 60, we see a drop to 0.658, and then an improvement to 0.726.
Clearly something odd is going on when we do not know the duration, and unfortunately, that is the case in real life. So we need to do some further thinking about Perturbation modeling if we want to move forward.



