Fix keyword expansion in files from SVN repositories
Submitted
Updated 8 months, 4 weeks ago
| Christian Hammond | Reviewers | ||
| trunk | reviewboard | ||
| 362 | |||
| None | Review Board SVN | ||
Fixed issues with SVN keyword expansion. If a file had $Id$ or $Date$ or something and the keyword was defined in that file's svn:keywords property, svn cat would return the file with the keyword expanded. This broke patching for us and essentially meant that any patches modifying anything near a keyword expansion would break. We now collapse the keyword by turning $Keyword:.*$ into $Keyword$. We only do this for keywords actually defined in that file's svn:keywords, in order to prevent collapsing things that aren't actually defined keywords but look like them. This also supports known keyword aliases.
Unit tests succeed! Debug output shows we're collapsing correctly.
Posted 9 months ago (February 22nd, 2008, 9:11 p.m.)
This is a great start! I imagine that this alone will fix the issue for most of my users, so let's please get this in. However, there are a few corner cases I can think of that the current test doesn't expose:
1. Fixed-length substitution: By using a double-colon, you can tell SVN the exact length of the field. e.g.,
$Revision:: $
becomes
$Revision:: 42 $
2. Keyword aliases: There are only 5 keywords that SVN understands, but several have equivalent aliases. The worst part is that expansion still happens if the file uses a different alias than what's in svn:keywords.
3. Non-keywords: The property can contain any arbitrary keywords, and any invalid keywords are silently ignored. Your code will happily collapse anything. Admittedly, it would be a quite malformed repo for someone to run into this.
-
/trunk/reviewboard/scmtools/svn.py (Diff revision 1) 65 keyword_names = keywords[normpath].replace(",", "|")
-
The docs on svn:keywords specify that they are delimited by whitespace, not commas.
I sure wish we didn't have to re-implement all of SVN's crazy keyword rules to make this robust. The most painful part is that SVN checkouts do have an unexpanded copy on the client side in .svn/text-base/foobar.svn-base! I can't find any way to capture that using pysvn though, unless you actually create a temporary working directory. (yuck)
Posted 9 months ago (February 22nd, 2008, 9:11 p.m.)
This is a great start! I imagine that this alone will fix the issue for most of my users, so let's please get this in. However, there are a few corner cases I can think of that the current test doesn't expose:
1. Fixed-length substitution: By using a double-colon, you can tell SVN the exact length of the field. e.g.,
$Revision:: $
becomes
$Revision:: 42 $
2. Keyword aliases: There are only 5 keywords that SVN understands, but several have equivalent aliases. The worst part is that expansion still happens if the file uses a different alias than what's in svn:keywords.
3. Non-keywords: The property can contain any arbitrary keywords, and any invalid keywords are silently ignored. Your code will happily collapse anything. Admittedly, it would be a quite malformed repo for someone to run into this.
I sure wish we didn't have to re-implement all of SVN's crazy keyword rules to make this robust. The most painful part is that SVN checkouts do have an unexpanded copy on the client side in .svn/text-base/foobar.svn-base! I can't find any way to capture that using pysvn though, unless you actually create a temporary working directory. (yuck)
Posted 8 months, 4 weeks ago (February 24th, 2008, 4:51 p.m.)
-
/trunk/reviewboard/scmtools/svn.py (Diff revision 2) 14 keyword_aliases = {
15 'Date': ['LastChangedDate'],
16 'Revision': ['LastChangedRevision', 'Rev'],
17 'Author': ['LastChangedBy'],
18 'HeadURL': ['URL'],
19 } -
This table needs to have the reverse mappings as well.
-
/trunk/reviewboard/scmtools/svn.py (Diff revision 2) 108 keywords = keyword_str.split(" ")
109 110 # Get any aliased keywords111 keywords += [alias
112 for keyword in keywords
113 for alias in self.keyword_aliases.get(keyword, [])]
-
If you make keyword_aliases include self-references (e.g., "Id": ["Id"], "URL": ["URL", "HeadURL"], etc.), and then build the final list purely from the alias table, instead of appending, then unknown keywords will be ignored (corner case #3). You'll also prevent any weird values from going into '|'.join(keywords) which might break the regular expression. I know these cases are unlikely to crop up in real repositories, but it's pretty easy to be protective here.
-
/trunk/reviewboard/scmtools/tests.py (Diff revision 2) 261 newfile = patch(diff, file, filename) # Throws an exception!
269 print "Resulting collapsed file: [%s]" % file
-
Debug statement...
With those tweaks, this will cover all the cases that I can think of. Thanks!
-
/trunk/reviewboard/scmtools/svn.py (Diff revision 4) 15 keywords = {
-
Need "Id" and "Rev" in here also.
Looks good - please commit!