[TSVN] RFC: New cache scheme

Author: Will Dean <svn at indcomp dot co dot uk>
Date: 2005-01-20 13:21:54 PST

Guys,

Now that 1.1.3 is out of the way, I'd like to say something about the new
cache I've been working on. It's very preliminary at the moment, but I
thought this would be a good time to get comments about the concept.

I think everybody's aware that the TSVN shell extension does lots of
caching of file status, to try and minimise the amount of SVN work which
goes on as one is browsing through Explorer. In normal situations, this
cache is now pretty good, and I generally find performance to be
acceptable. (I first got involved with the TSVN code because I was so
infuriated by the shell performance - that was partly an SVN problem as it
turned out.)

Anyway, for all its charm, the current shell extension does have some
problems:

1. In order to prevent the cache becoming stale (and, I think, because of
historical concern about the amount of memory it might consume), cached
items have very short lifetimes (a few seconds). In certain cases (large
directories, slow filesystems), this can cause pathological cache
thrashing, where the time taken to build the cache exceeds the lifetime of
its members. This is disastrous, as the very time you need the cache most
is when it's slow to build. There is plenty of sticking plaster stuck on
this particular wound, but it's not very pretty.

2. Unless you're in recursive status mode, the cache only holds the status
for one folder.

3. The SVN libraries are statically linked, big, and slow to start. The
shell extension has to include them in order to get item status. Every
process which opens a file-open dialog (not exactly a lightweight activity
at the best of times) has to suffer SVN starting up and loading into the
process the first time the dialog is opened.

4. Because the shell extension is an in-process COM object (shell
extensions are supposed to be in-process, this isn't a mistake), there is
one cache per process. With the current very short cache lifetimes, this
doesn't really make any difference to anybody, but it could be a
missed opportunity in terms of re-use of cached items. (For example, I
think it's reasonably probable that you'll have Explorer windows and app
file-open boxes pointing into similar folders.)

5. Shell extensions are a pig to debug.
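
To put rough numbers on the thrashing described in point 1, here is a toy
simulation in Python (not TSVN code). It assumes, as a simplification, that
every entry is stamped when a rebuild starts and that requests are served
one at a time. The function name and its parameters are made up for
illustration.

```python
def count_rebuilds(build_time, lifetime, request_times):
    """Count how often the cache is rebuilt for a stream of status requests.

    build_time:    seconds needed to populate the cache for a folder
    lifetime:      seconds an entry stays valid, measured from rebuild start
    request_times: ascending arrival times (seconds) of status requests
    """
    clock = 0.0                 # time at which the next request can be served
    expires = float("-inf")     # nothing is cached at the start
    rebuilds = 0
    for t in request_times:
        clock = max(clock, t)   # a request cannot be served before it arrives
        if clock >= expires:    # cached entries are stale (or missing)
            rebuilds += 1
            expires = clock + lifetime
            clock += build_time # the caller waits while the cache is rebuilt
    return rebuilds


# Ten requests, 0.1 s apart, as Explorer paints the icons in one folder.
requests = [i * 0.1 for i in range(10)]
fast_folder = count_rebuilds(build_time=0.5, lifetime=2.0, request_times=requests)
slow_folder = count_rebuilds(build_time=3.0, lifetime=2.0, request_times=requests)
```

Under these assumptions the fast folder is built once and then served from
the cache, while the slow folder is rebuilt for every one of the ten
requests, because each build outlives the entries it produces.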

I have been working on a completely different way of doing things, which
shows some promise. It goes as follows:

1. Create a new application 'TSVNCache', which can run in the background,
with a simple IPC interface which allows other processes to request the SVN
status of a path. There's no UI on this application.

2. Rip all the SVN status stuff out of the shell extension and replace it
with something which asks TSVNCache for the status of a path. The shell
extension knows nothing about SVN except for the arrangement of a
svn_wc_status_t structure (which is what it's given by TSVNCache). The
cache knows nothing about the shell extension or why it wants the status of
the file, it just returns the status. To take this step to the limit, the
property-page handler would probably need to come out into a separate DLL,
because it's always going to need SVN.

At this point, we've probably slowed things down slightly, because there's
now an inter-process call (on a named pipe) between the shell extension and
the cache. However, the cache is now a nice little stand-alone process,
which one can start and stop at will, play around with and debug
easily. (If you stop TSVNCache, the shell extension just marks things as
unversioned, connecting to the cache again when it restarts.)
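
The fallback behaviour described above (report "unversioned" while the
cache is down, reconnect once it is back) could look roughly like the
following Python sketch. The transport is abstracted away: `connect` is a
hypothetical callable standing in for opening the named pipe, and
`FakeCache` is only a stand-in for TSVNCache so the example can run on its
own. None of these names come from the TSVN source.

```python
UNVERSIONED = "unversioned"


class CacheClient:
    """Shell-extension side: asks the cache process for the status of a path."""

    def __init__(self, connect):
        self._connect = connect   # hypothetical: returns an object with request(path)
        self._conn = None

    def status(self, path):
        try:
            if self._conn is None:
                self._conn = self._connect()
            return self._conn.request(path)
        except OSError:
            # Cache not running or pipe broken: forget the connection so the
            # next call tries to reconnect, and report the item as unversioned.
            self._conn = None
            return UNVERSIONED


class FakeCache:
    """Stand-in for the TSVNCache process, for demonstration only."""

    def __init__(self):
        self.running = False
        self.statuses = {}

    def connect(self):
        if not self.running:
            raise ConnectionRefusedError("cache not running")
        return self

    def request(self, path):
        if not self.running:
            raise BrokenPipeError("cache stopped")
        return self.statuses.get(path, UNVERSIONED)
```

Because the client drops its connection on any failure, stopping and
restarting the cache needs no coordination with the processes that host
the shell extension.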

So, the next step is to improve the cache:

3. Separate the caching of files and folders, so that you can build a big
cache without needing to search a huge list of unstructured file names.

4. Increase the cache-lifetime (let's say that it's infinite).

5. Keep track of the modification time of files which are cached, and the
modification time of the .svn\entries file, and use these as hints to
invalidate the cache. Note that these hints are agnostic about the client
you use, so you can use the SVN command-line client and the cache will
still be invalidated
properly.
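
Steps 3 to 5 can be sketched in Python as follows. This is an illustration
of the idea, not the TSVNCache implementation: entries are grouped per
folder, never expire by age, and are dropped only when the modification
time of the file or of the folder's .svn entries file changes. The
`fetch_status` callable is a hypothetical stand-in for the real SVN status
call.

```python
import os


def _mtime(path):
    """Modification time in nanoseconds, or None if the path does not exist."""
    try:
        return os.stat(path).st_mtime_ns
    except OSError:
        return None


class FolderEntry:
    def __init__(self, entries_mtime):
        self.entries_mtime = entries_mtime   # mtime of .svn/entries when cached
        self.files = {}                      # file name -> (file mtime, status)


class StatusCache:
    def __init__(self, fetch_status):
        self._fetch = fetch_status   # hypothetical: path -> status string
        self._folders = {}           # folder path -> FolderEntry

    def status(self, path):
        folder, name = os.path.split(path)
        entries_mtime = _mtime(os.path.join(folder, ".svn", "entries"))
        entry = self._folders.get(folder)
        if entry is None or entry.entries_mtime != entries_mtime:
            # The entries file changed (commit, update, add, ...): discard
            # everything cached for this folder.
            entry = self._folders[folder] = FolderEntry(entries_mtime)
        file_mtime = _mtime(path)
        cached = entry.files.get(name)
        if cached is None or cached[0] != file_mtime:
            # New file, or the file was edited since it was cached.
            cached = entry.files[name] = (file_mtime, self._fetch(path))
        return cached[1]
```

Looking a file up costs two stat calls and two dictionary lookups, and the
per-folder grouping means a changed entries file invalidates only that
folder's entries.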

.... This is about where I've got to at the moment ....

I don't currently implement recursive folder status, but my idea for this
is to do something along the following lines:

1. Fetch the minimum required status information synchronously, as at the
moment.
2. As a lazy, background task, recurse downwards from each folder which is
cached, calculating the dominant SVN status for each folder.
3. Issue shell-update requests for folders as their recursive status
becomes known.
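
A rough Python sketch of steps 2 and 3 follows. It is written as a
generator so that a background thread could pull one result at a time and
issue a shell-update request for each. The precedence order of the
statuses is an assumption made for the example, as are the `tree` and
`own_status` dictionaries, which stand in for whatever the cache would
really hold.

```python
# Assumed precedence: entries later in the list dominate earlier ones.
PRECEDENCE = ["unversioned", "normal", "added", "deleted", "modified", "conflicted"]


def dominant(statuses):
    """Return the status with the highest precedence."""
    return max(statuses, key=PRECEDENCE.index)


def recursive_statuses(tree, own_status, root):
    """Lazily yield (folder, recursive status) pairs, deepest folders first.

    tree:       folder -> list of child folders
    own_status: folder -> dominant status of the items directly inside it
    """
    def walk(folder):
        child_results = []
        for child in tree.get(folder, []):
            # Each child's recursive status comes back as the generator's
            # return value once its whole subtree has been reported.
            child_results.append((yield from walk(child)))
        result = dominant([own_status[folder]] + child_results)
        yield folder, result   # the point at which a shell update could be issued
        return result

    yield from walk(root)


tree = {"root": ["src", "doc"], "src": ["src/lib"]}
own_status = {"root": "normal", "src": "normal",
              "src/lib": "modified", "doc": "normal"}
updates = list(recursive_statuses(tree, own_status, "root"))
```

Here the modification in src/lib propagates up to src and root, while doc
stays normal.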

Because the cache is now so durable, usable recursive status becomes a real
possibility, which I don't feel it is at the moment (it's more of a
tantalising peek at how good it could be).

So, what do people think about all this? I'm particularly interested in
people's views on the legitimacy of my cache invalidation strategy, but I'd
welcome any input.

(Just for interest, I started by trying to implement something based on
change notifications, which would have meant I could then have the cache
generate all the shell-update notifications, but I don't think this is very
scalable.)

When I get this a bit more together, I shall also be looking for some
"people with enquiring minds" to try it out.

Cheers,

Will


