Hello!
I'd like to report a number of issues in CDK sources which were found by IntelliJ IDEA dataflow analysis. Here I mention only some important (to my opinion) issues which require attention. I don't report here redundant checks which are look harmless or potential inter-procedural nullability problems if I'm unsure that NPE could actually occur. I'm not going to submit a pull-request fixing these issues, because I'm not familiar with the code and may only speculate on how it should be properly fixed.
- org.openscience.cdk.renderer.generators.standard.AbbreviationLabel#norm
if (c > 128)
return 0;
switch (c) {
case '\u002d': // hyphen
case '\u2012': // figure dash << this branch is unreachable as c > 128 was excluded before
case '\u2013': // en-dash << same
case '\u2014': // em-dash << same
case '\u2212': // minus << same
return '-'; // 002d
default:
return c;
}
- org.openscience.cdk.qsar.descriptors.molecular.ALOGPDescriptor#getHAtomType (at the end of the method):
if (oxNum == 0) {
if (hybrid.equals("sp3")) { ...
} else if (hybrid.equals("sp2")) return 47;
} else if (oxNum == 1 && hybrid.equals("sp3"))
return 47;
else if ((oxNum == 2 && hybrid.equals("sp3")) || (oxNum == 1 && hybrid.equals("sp2"))
|| (oxNum == 0 && hybrid.equals("sp"))) // << oxNum == 0 && hybrid.equals("sp") is always false as oxNum == 0 was processed above. If this condition is true, 0 will be returned instead
return 48;
...
- org.openscience.cdk.geometry.AtomTools#calculate3DCoordinates0
Point3d points[] = new Point3d[0];
if (nwanted == 1) {
points = new Point3d[1];
points[0] = new Point3d(aPoint);
points[0].add(new Vector3d(length, 0.0, 0.0));
} else if (nwanted == 2) {
points[0] = new Point3d(aPoint); // << here points is zero-length array, so ArrayIndexOutOfBoundsException will occur. Probably points = new Point3d[2] is missing before this line
points[0].add(new Vector3d(length, 0.0, 0.0));
points[1] = new Point3d(aPoint);
points[1].add(new Vector3d(-length, 0.0, 0.0));
The same problem in nwanted == 3 and nwanted == 4 branches, these branches will always fail with AIOOBE
- org.openscience.cdk.qsar.descriptors.bond.BondSigmaElectronegativityDescriptor#setParameters
if (!(params[0] instanceof Integer)) {
throw new CDKException("The parameter must be of type Integer");
}
if (params.length == 0) return; // << cannot be true, because params[0] was checked before. If params is empty array, we would have AIOOBE thrown on the previous condition if(!(params[0] instanceof Integer))
maxIterations = (Integer) params[0];
org.openscience.cdk.qsar.descriptors.atomic.SigmaElectronegativityDescriptor#setParameters - same
- org.openscience.cdk.atomtype.CDKAtomTypeMatcher#perceiveBromine
} else if ((atom.getFormalCharge() != CDKConstants.UNSET && atom.getFormalCharge() == -1)) {
IAtomType type = getAtomType("Br.minus");
if (isAcceptable(atom, atomContainer, type)) return type;
} else if (atom.getFormalCharge() == 1) { // << CDKConstants.UNSET is null. If it's really possible to see null here, then you will have NPE during auto-unboxing in this comparison. If it's impossible, then previous "!= CDKConstants.UNSET" check is redundant
- org.openscience.cdk.smsd.filters.ChemicalFilters#getBondTypeMatches
if (rStereo != 4 || pStereo != 4 || rStereo != 3 || pStereo != 3) { // << cannot be false as any number is either not 4 or not 3. Likely && was intended instead of ||
- org.openscience.cdk.io.CIFReader#readChemFile (close to the method end)
if (line != null) line = line.trim();
while (!line.equals(";")) { // either null-check is redundant or (more likely) NPE is possible here
- org.openscience.cdk.io.CrystClustReader#readChemFile (close to the method end)
while (!line.startsWith("frame:") && input.ready() && line != null) { // << line null-check after line.startsWith. Either null-check is redundant or NPE is possible here
- org.openscience.cdk.math.IMatrix#IMatrix(org.openscience.cdk.math.Matrix)
public IMatrix(Matrix m) {
rows = m.rows;
columns = m.columns;
int i, j;
for (i = 0; i < rows; i++)
for (j = 0; j < columns; j++) {
realmatrix[i][j] = m.matrix[i][j]; // << this constructor does not allocate realmatrix and imagmatrix arrays, so NPE will occur when this constructor is used
imagmatrix[i][j] = 0d;
}
}
- org.openscience.cdk.qsar.descriptors.atomic.IPAtomicHOSEDescriptor#extractInfo
int countSpace = 0;
boolean foundDigit = false;
for (int i = 0; i < strlen; i++) {
if (!foundDigit) if (Character.isLetter(str.charAt(i))) foundDigit = true;
if (foundDigit) {
if (Character.isWhitespace(str.charAt(i))) {
if (countSpace == 0) { // << countSpace is never modified, so it's always 0 and condition is always true. Probably it's a mistake or unfinished code
foundSpace = true;
org.openscience.cdk.qsar.descriptors.atomic.ProtonAffinityHOSEDescriptor#extractInfo - same problem
- org.openscience.cdk.smsd.Isomorphism#getEuclideanDistance
if (getFirstMapping() != null || !getFirstMapping().isEmpty()) { // << the second part is executed only if getFirstMapping() returned null, in which case NPE is inevitable. Likely && was intended instead of ||
- org.openscience.cdk.smsd.algorithm.mcgregor.McGregor#checkmArcs
List<Integer> posNumList = new ArrayList<Integer>(size); // size does not set list size, it sets the initial size of backing array
for (int i = 0; i < posNumList.size(); i++) { // << this loop is never visited as posNumList is empty
posNumList.add(i, 0);
}
- org.openscience.cdk.io.formats.MDLFormat#matches
if (lineNumber == 4 && line.length() > 7 && (line.indexOf("2000") == -1) && // MDL Mol V2000 format
(line.indexOf("3000") == -1)) // MDL Mol V3000 format
{
// possibly a MDL mol file
try {
String atomCountString = line.substring(0, 3).trim();
String bondCountString = line.substring(3, 6).trim();
Integer.valueOf(atomCountString);
Integer.valueOf(bondCountString);
if (line.length() > 6) { // << always true as line.length() > 7 was already checked before; probably just redundant, but could be a typo
- org.openscience.cdk.io.MDLV2000Writer#writeMolecule (around line 500)
while (last >= 0) {
if (atomprops[last] != 0)
break;
last--;
}
// matches BIOVIA syntax
if (last >= 2 && last < atomprops.length) // << last < atomprops.length is always true as before exiting the previous loop we accessed atomprops[last]. Seems just redundant condition, but probably a mistake
last = 5;
- org.openscience.cdk.iupac.parser.MoleculeBuilder#addFunGroup (line 189)
addAtom("Cl", currentMolecule.isEmpty() ? null : currentMolecule.isEmpty() ? null : currentMolecule.getAtom(0), IBond.Order.SINGLE, 0); // << currentMolecule.isEmpty() is checked twice. Probably just redundant check, but something else could be intended
- org.openscience.cdk.io.cml.PDBConvention#endElement
if (currentMolecule instanceof IAtomContainer) { // << always true except currentMolecule == null case as currentMolecule declared type is IAtomContainer. Condition could be removed or replaced with a null check if null case is possible
logger.debug("Adding molecule to set");
currentMoleculeSet.addAtomContainer(currentMolecule);
logger.debug("#mols in set: " + currentMoleculeSet.getAtomContainerCount());
} else if (currentMolecule instanceof ICrystal) { // << always false as ICrystal extends IAtomContainer, so this branch is never visited. Probably 'else' is redundant
logger.debug("Adding crystal to chemModel");
currentChemModel.setCrystal((ICrystal) currentMolecule);
currentChemSequence.addChemModel(currentChemModel);
}
- org.openscience.cdk.io.RssWriter#write
if (object instanceof IAtomContainer) {
root = convertor.cdkAtomContainerToCMLMolecule((IAtomContainer) object);
} else if (object instanceof ICrystal) { // << similar to previous case: this branch is unreachable
root = convertor.cdkCrystalToCMLMolecule((ICrystal) object);
- org.openscience.cdk.smsd.algorithm.vflib.VFlibMCSHandler#mcgregorFlag
if (commonAtomCount > vfMCSSize && commonAtomCount > vfMCSSize) { // << repeated check, probably something else was intended
org.openscience.cdk.smsd.algorithm.vflib.VFlibSubStructureHandler#mcgregorFlag, org.openscience.cdk.smsd.algorithm.vflib.VFlibTurboHandler#mcgregorFlag - same
19. org.openscience.cdk.smsd.algorithm.vflib.VFlibTurboHandler#setVFMappings
Integer qIndex = Integer.valueOf(getReactantMol().indexOf(qAtom));
Integer tIndex = Integer.valueOf(getProductMol().indexOf(tAtom));
if (qIndex != null && tIndex != null) { // << condition is alays true: Integer.valueOf never returns null. Probably it was intended to check for -1 as -1 is returned by indexOf when atom not found
atomatomMapping.put(qAtom, tAtom);
indexindexMapping.put(qIndex, tIndex);
} else {
try {
throw new CDKException("Atom index pointing to NULL");
} catch (CDKException ex) {
LOGGER.error(Level.SEVERE, null, ex);
}
}
Hello!
I'd like to report a number of issues in CDK sources which were found by IntelliJ IDEA dataflow analysis. Here I mention only some important (to my opinion) issues which require attention. I don't report here redundant checks which are look harmless or potential inter-procedural nullability problems if I'm unsure that NPE could actually occur. I'm not going to submit a pull-request fixing these issues, because I'm not familiar with the code and may only speculate on how it should be properly fixed.
The same problem in nwanted == 3 and nwanted == 4 branches, these branches will always fail with AIOOBE
org.openscience.cdk.qsar.descriptors.atomic.SigmaElectronegativityDescriptor#setParameters - same
org.openscience.cdk.qsar.descriptors.atomic.ProtonAffinityHOSEDescriptor#extractInfo - same problem
org.openscience.cdk.smsd.algorithm.vflib.VFlibSubStructureHandler#mcgregorFlag, org.openscience.cdk.smsd.algorithm.vflib.VFlibTurboHandler#mcgregorFlag - same
19. org.openscience.cdk.smsd.algorithm.vflib.VFlibTurboHandler#setVFMappings