Attending: Naveed, Kelly, Eileen
Naveed is looking at papers that might give insight into three problems: comparing the clusters of user sessions generated by H-UNC and by SCFC, evaluating clustering algorithms, and characterizing the behavior of users assigned to a particular cluster.
Naveed presents on: "Mining Web Logs for Prediction Models in WWW Caching and Prefetching", by a group at Simon Fraser and IBM.
GDSF (Greedy-Dual-Size-Frequency) is the page replacement algorithm in use.
... Idea is to dynamically decide what to prefetch and cache to get better performance.
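For reference, a rough sketch of how a GDSF cache picks what to evict. This follows the commonly cited formulation, priority = clock + frequency * cost / size, and is not necessarily the exact variant used in the paper. The class name GDSFCache, the access method, and the default cost of 1.0 are made up for illustration.

```python
class GDSFCache:
    """Minimal Greedy-Dual-Size-Frequency cache sketch (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity  # total size budget, e.g. in bytes
        self.used = 0             # size currently occupied
        self.clock = 0.0          # inflation value, raised on each eviction
        self.entries = {}         # url -> dict(freq, size, cost, priority)

    def access(self, url, size, cost=1.0):
        """Record a request for url; return True on a cache hit."""
        hit = url in self.entries
        if hit:
            entry = self.entries[url]
            entry["freq"] += 1
        else:
            if size > self.capacity:
                return False  # object can never fit, so do not cache it
            # Evict lowest-priority objects until the new one fits.
            while self.used + size > self.capacity:
                victim = min(self.entries,
                             key=lambda u: self.entries[u]["priority"])
                self.clock = self.entries[victim]["priority"]
                self.used -= self.entries[victim]["size"]
                del self.entries[victim]
            entry = {"freq": 1, "size": size, "cost": cost}
            self.entries[url] = entry
            self.used += size
        # Priority favors objects that are frequent, costly to fetch, and small.
        entry["priority"] = (self.clock
                             + entry["freq"] * entry["cost"] / entry["size"])
        return hit
```

Small, frequently requested pages end up with high priority and stay cached; large, rarely requested ones are evicted first. A prefetching model would sit on top of this by issuing extra access calls for pages it predicts.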
How to apply:
For our data:
--------------
Calculate top n most frequently accessed pages.
Calculate top n most frequently accessed page pairs.
Calculate top n most frequently accessed page triples.
(Can we work in statistical significance, likelihood of access?)

See links to potentially good papers in bookmarks.txt
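A possible starting point for the counts above, as a sketch only. It assumes each session is already available as an ordered list of page URLs, and it treats a "pair" or "triple" as pages requested consecutively within one session (the notes do not settle this). The function names and sample sessions are hypothetical. The second function gives a crude likelihood-of-access estimate, the fraction of times one page was followed by another; it is not a significance test.

```python
from collections import Counter


def top_n_sequences(sessions, k, n):
    """Return the n most frequent runs of k consecutive pages."""
    counts = Counter()
    for pages in sessions:
        for i in range(len(pages) - k + 1):
            counts[tuple(pages[i:i + k])] += 1
    return counts.most_common(n)


def transition_likelihood(sessions, prev_page, next_page):
    """Estimate P(next_page | prev_page) from consecutive requests."""
    pair_counts = Counter()
    prev_counts = Counter()
    for pages in sessions:
        for a, b in zip(pages, pages[1:]):
            pair_counts[(a, b)] += 1
            prev_counts[a] += 1  # times page a was followed by anything
    if prev_counts[prev_page] == 0:
        return 0.0
    return pair_counts[(prev_page, next_page)] / prev_counts[prev_page]


# Hypothetical sessions, standing in for the real log data.
sessions = [
    ["/index", "/courses", "/faculty"],
    ["/index", "/courses", "/admissions"],
    ["/index", "/courses", "/faculty"],
]

top_pages = top_n_sequences(sessions, 1, 5)    # single pages
top_pairs = top_n_sequences(sessions, 2, 5)    # page pairs
top_triples = top_n_sequences(sessions, 3, 5)  # page triples
```

Calling top_n_sequences with k = 1, 2, 3 covers the three counts in the list. If pairs should instead mean any two pages co-occurring in a session regardless of order, the inner loop would need to enumerate combinations rather than consecutive runs.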