Skip to content

Fix threadsafety#55

Closed
stdweird wants to merge 2 commits intoquattor:masterfrom
stdweird:fix_threadsafety
Closed

Fix threadsafety#55
stdweird wants to merge 2 commits intoquattor:masterfrom
stdweird:fix_threadsafety

Conversation

@stdweird
Copy link
Member

No description provided.

@stdweird
Copy link
Member Author

I'm investiagting a few issues with the compiler, these are the results of this work. Don't merge this in yet.

@stdweird
Copy link
Member Author

This is the coverity report

1. read_volatile: Reading index, a volatile field, without any lock held.
 2. intervening_update: Another thread writes to index.
 CID 1128482 (#1 of 1): Volatile not atomically updated (VOLATILE_ATOMICITY)
3. stale_update: Updating index based on a stale value. Any intervening update in another thread is overwritten.

@stdweird
Copy link
Member Author

The crash happens rather frequently: building our profiles from scratch gives 7 failures on 25 builds. After the path, this gives 0 on 25.
This is the stack trace with the crash

compile.cluster.profiles:
     [echo] Cluster: delcatty in /home/stdweird/.git/ugenthpc/quattor/quattor/cfg/clusters
     [echo] 
     [panc] 346/346 object template(s) being processed
     [panc] java.lang.NullPointerException
     [panc]     at java.util.TreeMap.rotateLeft(TreeMap.java:2060)
     [panc]     at java.util.TreeMap.fixAfterInsertion(TreeMap.java:2127)
     [panc]     at java.util.TreeMap.put(TreeMap.java:574)
     [panc]     at org.quattor.pan.dml.data.HashResource.put(HashResource.java:103)
     [panc]     at org.quattor.pan.type.RecordType.setDefaults(RecordType.java:231)
     [panc]     at org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc]     at org.quattor.pan.type.AliasType.setDefaults(AliasType.java:96)
     [panc]     at org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc]     at org.quattor.pan.type.RecordType.setDefaults(RecordType.java:233)
     [panc]     at org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc]     at org.quattor.pan.type.AliasType.setDefaults(AliasType.java:96)
     [panc]     at org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc]     at org.quattor.pan.type.RecordType.setDefaults(RecordType.java:211)
     [panc]     at org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc]     at org.quattor.pan.type.AliasType.setDefaults(AliasType.java:96)
     [panc]     at org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc]     at org.quattor.pan.tasks.BuildTask$CallImpl.setDefaults(BuildTask.java:224)
     [panc]     at org.quattor.pan.tasks.BuildTask$CallImpl.call(BuildTask.java:125)
     [panc]     at org.quattor.pan.tasks.BuildTask$CallImpl.call(BuildTask.java:74)
     [panc]     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
     [panc]     at org.quattor.pan.cache.AbstractCache.retrieve(AbstractCache.java:137)
     [panc]     at org.quattor.pan.cache.AbstractCache.waitForResult(AbstractCache.java:171)
     [panc]     at org.quattor.pan.tasks.Valid1Task$CallImpl.call(Valid1Task.java:85)
     [panc]     at org.quattor.pan.tasks.Valid1Task$CallImpl.call(Valid1Task.java:68)
     [panc]     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
     [panc]     at org.quattor.pan.cache.AbstractCache.retrieve(AbstractCache.java:137)
     [panc]     at org.quattor.pan.cache.AbstractCache.waitForResult(AbstractCache.java:171)
     [panc]     at org.quattor.pan.tasks.Valid2Task$CallImpl.call(Valid2Task.java:82)
     [panc]     at org.quattor.pan.tasks.Valid2Task$CallImpl.call(Valid2Task.java:65)
     [panc]     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
     [panc]     at org.quattor.pan.cache.AbstractCache.retrieve(AbstractCache.java:137)
     [panc]     at org.quattor.pan.cache.AbstractCache.waitForResult(AbstractCache.java:171)
     [panc]     at org.quattor.pan.tasks.WriteOutputTask$CallImpl.call(WriteOutputTask.java:89)
     [panc]     at org.quattor.pan.tasks.WriteOutputTask$CallImpl.call(WriteOutputTask.java:61)
     [panc]     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
     [panc]     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
     [panc]     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
     [panc]     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     [panc]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     [panc]     at java.lang.Thread.run(Thread.java:744)
     [panc] type lookup failed for 'ganesha_nfsv4'
     [panc] org.quattor.pan.exceptions.CompilerError.create(CompilerError.java:58)
     [panc] org.quattor.pan.type.AliasType.setDefaults(AliasType.java:100)
     [panc] org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc] org.quattor.pan.type.RecordType.setDefaults(RecordType.java:233)
     [panc] org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc] org.quattor.pan.type.AliasType.setDefaults(AliasType.java:96)
     [panc] org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc] org.quattor.pan.type.RecordType.setDefaults(RecordType.java:211)
     [panc] org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc] org.quattor.pan.type.AliasType.setDefaults(AliasType.java:96)
     [panc] org.quattor.pan.type.FullType.setDefaults(FullType.java:153)
     [panc] org.quattor.pan.tasks.BuildTask$CallImpl.setDefaults(BuildTask.java:224)
     [panc] org.quattor.pan.tasks.BuildTask$CallImpl.call(BuildTask.java:125)
     [panc] org.quattor.pan.tasks.BuildTask$CallImpl.call(BuildTask.java:74)
     [panc] java.util.concurrent.FutureTask.run(FutureTask.java:262)
     [panc] org.quattor.pan.cache.AbstractCache.retrieve(AbstractCache.java:137)
     [panc] org.quattor.pan.cache.AbstractCache.waitForResult(AbstractCache.java:171)
     [panc] org.quattor.pan.tasks.Valid1Task$CallImpl.call(Valid1Task.java:85)
     [panc] org.quattor.pan.tasks.Valid1Task$CallImpl.call(Valid1Task.java:68)
     [panc] java.util.concurrent.FutureTask.run(FutureTask.java:262)
     [panc] org.quattor.pan.cache.AbstractCache.retrieve(AbstractCache.java:137)
     [panc] org.quattor.pan.cache.AbstractCache.waitForResult(AbstractCache.java:171)
     [panc] org.quattor.pan.tasks.Valid2Task$CallImpl.call(Valid2Task.java:82)
     [panc] org.quattor.pan.tasks.Valid2Task$CallImpl.call(Valid2Task.java:65)
     [panc] java.util.concurrent.FutureTask.run(FutureTask.java:262)
     [panc] org.quattor.pan.cache.AbstractCache.retrieve(AbstractCache.java:137)
     [panc] org.quattor.pan.cache.AbstractCache.waitForResult(AbstractCache.java:171)
     [panc] org.quattor.pan.tasks.WriteOutputTask$CallImpl.call(WriteOutputTask.java:89)
     [panc] org.quattor.pan.tasks.WriteOutputTask$CallImpl.call(WriteOutputTask.java:61)
     [panc] java.util.concurrent.FutureTask.run(FutureTask.java:262)
     [panc] java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
     [panc] java.util.concurrent.FutureTask.run(FutureTask.java:262)
     [panc] java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     [panc] java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     [panc] java.lang.Thread.run(Thread.java:744)
     [panc] 
     [panc] 346 templates
     [panc] 346/346 compiled, 0/0 annotations, 690/692 xml, 0/0 dependency
     [panc] 1 errors, 6,761 ms, 697 MB/1,749 MB heap, 15 MB/130 MB nonheap
     [panc] 

@stdweird stdweird mentioned this pull request May 24, 2014
@loomis loomis added the bug label May 25, 2014
@loomis
Copy link

loomis commented May 25, 2014

I've checked in a correction which makes the ListResource iterator thread safe without impacting performance. This will probably have no impact on the issue as iterators are never (AFAIK) used from multiple threads in the code.

Making the HashResource thread safe would be difficult to do. Using another map implementation means that the ordering of the keys is lost, so all of the serializers would need to be modified to do the key ordering themselves. Synchronization is possible, but as you've seen has a strong impact on the performance.

The root problem probably lies in the default setting algorithm in which appropriate clones of HashResources are not made when defaults are being set.

@stdweird
Copy link
Member Author

see issue #56 for followup

@stdweird stdweird closed this Jun 23, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Development

Successfully merging this pull request may close these issues.

2 participants