
To log, or not to log: using heuristics to identify mandatory log events – a controlled experiment

Empirical Software Engineering

Abstract

Context

User activity logs should capture evidence to help answer who, what, when, where, why, and how a security or privacy breach occurred. However, software engineers often implement logging mechanisms that inadequately record mandatory log events (MLEs), user activities that must be logged to enable forensics.

Goal

The objective of this study is to support security analysts in performing forensic analysis by evaluating the use of a heuristics-driven method for identifying mandatory log events.

Method

We conducted a controlled experiment with 103 computer science students enrolled in a graduate-level software security course. In the pre-period task, all subjects were asked to identify MLEs described in a set of requirements statements. In the post-period task, subjects were randomly assigned statements from one type of software artifact (traditional requirements, use-case-based requirements, or user manual), one readability score (simple or complex), and one method (standards-, resource-, or heuristics-driven). We evaluated subject performance using three metrics: statement classification correctness (values from 0 to 1), MLE identification correctness (values from 0 to 1), and response time (seconds). We tested the effect of the three factors on the three metrics using generalized linear models.
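The abstract does not define how the correctness metrics are computed. As a rough illustration only, one plausible reading of statement classification correctness is the fraction of statements a subject labels correctly as containing or not containing an MLE. The function name, the boolean encoding, and the sample data below are hypothetical and are not taken from the study.

```python
from typing import Sequence


def classification_correctness(
    responses: Sequence[bool], answer_key: Sequence[bool]
) -> float:
    """Return the fraction of statements whose label matches the answer key.

    Assumption: each entry is True if the statement is judged to contain
    an MLE and False otherwise. The result lies between 0 and 1.
    """
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    if not answer_key:
        raise ValueError("at least one statement is required")
    matches = sum(r == k for r, k in zip(responses, answer_key))
    return matches / len(answer_key)


# Hypothetical answer key and one subject's labels for four statements.
answer_key = [True, False, True, True]
pre_responses = [True, True, False, True]    # pre-period task
post_responses = [True, False, False, True]  # post-period task

pre_score = classification_correctness(pre_responses, answer_key)
post_score = classification_correctness(post_responses, answer_key)

# Change in correctness from the pre- to the post-period task.
change = post_score - pre_score
print(f"pre={pre_score:.2f} post={post_score:.2f} change={change:+.2f}")
```

Under this reading, a positive change corresponds to the kind of pre- to post-period increase reported in the results; the study's actual scoring procedure may differ.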

Results

Classification correctness for statements that did not contain MLEs increased by 0.31 from the pre- to the post-period task. MLE identification correctness was inconsistent across treatment groups. For simple user manual statements, MLE identification correctness decreased by 0.17 and 0.12 for the standards- and heuristics-driven methods, respectively. For simple traditional requirements statements, MLE identification correctness increased by 0.16 and 0.17 for the standards- and heuristics-driven methods, respectively. Average response time decreased by 41.7 s from the pre- to the post-period task.

Conclusion

We expected the performance of subjects using the heuristics-driven method to improve from pre- to post-task and to consistently demonstrate higher MLE identification correctness than the standards-driven and resource-driven methods across domains and readability levels. However, neither method consistently helped subjects more correctly identify MLEs at a statistically significant level. Our results indicate additional training and enforcement may be necessary to ensure subjects understand and consistently apply the assigned methods for identifying MLEs.



Notes

  1. http://www.qualtrics.com/

  2. http://go.ncsu.edu/loggingexperiment

  3. This study was approved by the North Carolina State University Institutional Review Board (#5354)

  4. http://www.ihris.org/

  5. http://agile.csc.ncsu.edu/iTrust

  6. https://pkp.sfu.ca/ocs/

  7. To calculate readability metric values, we use the calculators provided by https://readability-score.com/

  8. http://go.ncsu.edu/loggingexperiment

  9. http://go.ncsu.edu/loggingexperiment

  10. http://www.jmp.com

  11. http://go.ncsu.edu/loggingexperiment

  12. http://go.ncsu.edu/loggingexperiment

Acknowledgments

This work is funded by the United States National Security Agency (NSA) Science of Security Lablet. Any opinions expressed in this report are those of the author(s) and do not necessarily reflect the views of the NSA. We thank the Realsearch research group for providing helpful feedback on this work. This study was approved by the North Carolina State University Institutional Review Board (#5354).

Author information

Corresponding author

Correspondence to Jason King.

Additional information

Communicated by: Richard Paige, Jordi Cabot and Neil Ernst

About this article

Cite this article

King, J., Stallings, J., Riaz, M. et al. To log, or not to log: using heuristics to identify mandatory log events – a controlled experiment. Empir Software Eng 22, 2684–2717 (2017). https://doi.org/10.1007/s10664-016-9449-1

  • DOI: https://doi.org/10.1007/s10664-016-9449-1
