Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-16903: Migrate off java.io.File to java.nio.file.Path from core files #2924

Merged
merged 10 commits into from
Feb 1, 2025
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
import jakarta.ws.rs.Path;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.QueryParam;
import java.io.IOException;
import org.apache.solr.client.api.model.SolrJerseyResponse;

/**
Expand Down Expand Up @@ -63,5 +64,6 @@ SolrJerseyResponse getFile(
String getFrom,
@Parameter(description = "Indicates that (only) file metadata should be fetched")
@QueryParam("meta")
Boolean meta);
Boolean meta)
throws IOException;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to confirm this throw was needed? Maybe because of something else furthur down?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually no it isn't needed. It was due to my changes on List<FileDetails> list() method under DistribFileStore so I threw a SolrException instead which seems correct.

}
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,6 @@
package org.apache.solr.cloud;

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.lang.invoke.MethodHandles;
Expand Down Expand Up @@ -121,7 +120,7 @@ public InputStream openResource(String resource) throws IOException {

try {
// delegate to the class loader (looking into $INSTANCE_DIR/lib jars)
is = classLoader.getResourceAsStream(resource.replace(File.separatorChar, '/'));
is = classLoader.getResourceAsStream(Path.of(resource).toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that actually works, I think a little comment is needed like "normalize path separator"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think it doesn't work; not on Windows. I thought I commented elsewhere in that lengthy review that all Path.of(resource).toString() that you are adding is probably flawed

} catch (Exception e) {
throw new IOException("Error opening " + resource, e);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,6 @@
*/
package org.apache.solr.core;

import java.io.File;
import java.io.IOException;
import java.lang.invoke.MethodHandles;
import java.nio.file.DirectoryStream;
Expand Down Expand Up @@ -367,10 +366,7 @@ private void close(CacheValue val) {
}

private static boolean isSubPath(CacheValue cacheValue, CacheValue otherCacheValue) {
int one = cacheValue.path.lastIndexOf(File.separatorChar);
int two = otherCacheValue.path.lastIndexOf(File.separatorChar);

return otherCacheValue.path.startsWith(cacheValue.path + File.separatorChar) && two > one;
return Path.of(otherCacheValue.path).startsWith(Path.of(cacheValue.path));
mlbiscoc marked this conversation as resolved.
Show resolved Hide resolved
}

@Override
Expand Down
8 changes: 4 additions & 4 deletions solr/core/src/java/org/apache/solr/core/CoreDescriptor.java
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,6 @@
*/
package org.apache.solr.core;

import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.lang.invoke.MethodHandles;
Expand Down Expand Up @@ -64,7 +63,7 @@ public class CoreDescriptor {
public static final String SOLR_CORE_PROP_PREFIX = "solr.core.";

public static final String DEFAULT_EXTERNAL_PROPERTIES_FILE =
dsmiley marked this conversation as resolved.
Show resolved Hide resolved
"conf" + File.separator + "solrcore.properties";
Path.of("conf/solrcore.properties").toString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any chance of different file seperators? I wasn't sure if that was one reason for using the File.separator, like on windows? Or is that one of the benefits of Path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried about this so started reading Path.of() documentation and oracle documentation. Filesystems.getDefault().getPath(String) which should return the OS path and developers can use the standard / regardless. It's also a benefit of Path for Solr to no longer have to do all these File.separator chars or OS checking...

I haven't actually tested this on something like a windows operating system to confirm though...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really great benefit of the migration!


/**
* Whether this core was configured using a configSet that was trusted. This helps in avoiding the
Expand Down Expand Up @@ -96,7 +95,7 @@ public Properties getPersistableUserProperties() {
CORE_CONFIG, "solrconfig.xml",
CORE_SCHEMA, "schema.xml",
CORE_CONFIGSET_PROPERTIES, ConfigSetProperties.DEFAULT_FILENAME,
CORE_DATADIR, "data" + File.separator,
CORE_DATADIR, Path.of("data/").toString(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trailing slash should ideally not be necessary. I'd prefer to just drop it and we deal with the consequences. If needed, the previous code was clearer IMO so I'd rather see something similarly clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

CORE_TRANSIENT, "false",
CORE_LOADONSTARTUP, "true");

Expand Down Expand Up @@ -240,7 +239,8 @@ public CoreDescriptor(
* <p>File paths are taken as read from the core's instance directory if they are not absolute.
*/
protected void loadExtraProperties() {
String filename = coreProperties.getProperty(CORE_PROPERTIES, DEFAULT_EXTERNAL_PROPERTIES_FILE);
String filename =
coreProperties.getProperty(CORE_PROPERTIES, DEFAULT_EXTERNAL_PROPERTIES_FILE.toString());
Path propertiesFile = instanceDir.resolve(filename);
if (Files.exists(propertiesFile)) {
try (Reader r = Files.newBufferedReader(propertiesFile, StandardCharsets.UTF_8)) {
Expand Down
82 changes: 43 additions & 39 deletions solr/core/src/java/org/apache/solr/core/DirectoryFactory.java
Original file line number Diff line number Diff line change
Expand Up @@ -17,17 +17,14 @@
package org.apache.solr.core;

import java.io.Closeable;
import java.io.File;
import java.io.FileFilter;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.lang.invoke.MethodHandles;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;
import org.apache.commons.io.file.PathUtils;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FilterDirectory;
Expand Down Expand Up @@ -338,59 +335,66 @@ public String getDataHome(CoreDescriptor cd) throws IOException {
}

public void cleanupOldIndexDirectories(
final String dataDirPath, final String currentIndexDirPath, boolean afterCoreReload) {
final String dataDirPath, final String currentIndexDirPath, boolean afterCoreReload)
throws IOException {

// TODO SOLR-8282 move to PATH
File dataDir = new File(dataDirPath);
if (!dataDir.isDirectory()) {
Path dataDirFile = Path.of(dataDirPath);
mlbiscoc marked this conversation as resolved.
Show resolved Hide resolved
if (!Files.isDirectory(dataDirFile)) {
log.debug(
"{} does not point to a valid data directory; skipping clean-up of old index directories.",
dataDirPath);
return;
}

final File currentIndexDir = new File(currentIndexDirPath);
File[] oldIndexDirs =
dataDir.listFiles(
new FileFilter() {
@Override
public boolean accept(File file) {
String fileName = file.getName();
return file.isDirectory()
&& !file.equals(currentIndexDir)
&& (fileName.equals("index") || fileName.matches(INDEX_W_TIMESTAMP_REGEX));
}
});
final Path currentIndexDir = Path.of(currentIndexDirPath);
List<Path> dirsList;
try (Stream<Path> oldIndexDirs = Files.list(dataDirFile)) {
dirsList =
oldIndexDirs
.filter(
(file) -> {
String fileName = file.getFileName().toString();
return Files.isDirectory(file)
&& !file.equals(currentIndexDir)
&& (fileName.equals("index") || fileName.matches(INDEX_W_TIMESTAMP_REGEX));
})
.sorted()
.toList();
}
;

if (oldIndexDirs == null || oldIndexDirs.length == 0)
if (dirsList.isEmpty()) {
return; // nothing to do (no log message needed)

List<File> dirsList = Arrays.asList(oldIndexDirs);
dirsList.sort(Collections.reverseOrder());
dsmiley marked this conversation as resolved.
Show resolved Hide resolved
}

int i = 0;
if (afterCoreReload) {
log.info("Will not remove most recent old directory after reload {}", oldIndexDirs[0]);
log.info("Will not remove most recent old directory after reload {}", dirsList.getFirst());
i = 1;
}

log.info(
"Found {} old index directories to clean-up under {} afterReload={}",
oldIndexDirs.length - i,
dirsList.size() - i,
dataDirPath,
afterCoreReload);
for (; i < dirsList.size(); i++) {
File dir = dirsList.get(i);
String dirToRmPath = dir.getAbsolutePath();
try {
if (deleteOldIndexDirectory(dirToRmPath)) {
log.info("Deleted old index directory: {}", dirToRmPath);
} else {
log.warn("Delete old index directory {} failed.", dirToRmPath);
}
} catch (IOException ioExc) {
log.error("Failed to delete old directory {} due to: ", dir.getAbsolutePath(), ioExc);
}
}

dirsList.stream()
.skip(i)
.forEach(
mlbiscoc marked this conversation as resolved.
Show resolved Hide resolved
(entry) -> {
String dirToRmPath = entry.toAbsolutePath().toString();
try {
if (deleteOldIndexDirectory(dirToRmPath)) {
log.info("Deleted old index directory: {}", dirToRmPath);
} else {
log.warn("Delete old index directory {} failed.", dirToRmPath);
}
} catch (IOException ioExc) {
log.error(
"Failed to delete old directory {} due to: ", entry.toAbsolutePath(), ioExc);
}
});
}

// Extension point to allow subclasses to infuse additional code when deleting old index
Expand Down
35 changes: 24 additions & 11 deletions solr/core/src/java/org/apache/solr/core/SolrCore.java
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,6 @@
import com.codahale.metrics.Counter;
import com.codahale.metrics.Timer;
import java.io.Closeable;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
Expand All @@ -33,6 +32,7 @@
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Constructor;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
Expand Down Expand Up @@ -1104,9 +1104,6 @@ protected SolrCore(
checkVersionFieldExistsInSchema(schema, coreDescriptor);
setLatestSchema(schema);

// initialize core metrics
initializeMetrics(solrMetricsContext, null);

// init pluggable circuit breakers, after metrics because some circuit breakers use metrics
initPlugins(null, CircuitBreaker.class);

Expand All @@ -1124,6 +1121,9 @@ protected SolrCore(
this.snapshotMgr = initSnapshotMetaDataManager();
this.solrDelPolicy = initDeletionPolicy(delPolicy);

// initialize core metrics
initializeMetrics(solrMetricsContext, null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you move this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a mistake. I forgot to revert. This has to do with metrics being pre-computed with File disk usage and was trying to get the data directory initialized before the metrics registered. I eventually stopped and forgot to move it back


this.codec = initCodec(solrConfig, this.schema);
initIndex(prev != null, reload);

Expand Down Expand Up @@ -1354,16 +1354,29 @@ public void initializeMetrics(SolrMetricsContext parentContext, String scope) {
Category.CORE.toString());
}

// TODO SOLR-8282 move to PATH
// initialize disk total / free metrics
Path dataDirPath = Paths.get(dataDir);
File dataDirFile = dataDirPath.toFile();
parentContext.gauge(
() -> dataDirFile.getTotalSpace(), true, "totalSpace", Category.CORE.toString(), "fs");
Path dataDirFile = Paths.get(dataDir);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really minor but I think dataDirPath is a better name

long totalSpace;
long useableSpace;

try {
totalSpace = Files.getFileStore(dataDirFile).getTotalSpace();
useableSpace = Files.getFileStore(dataDirFile).getUsableSpace();
} catch (Exception e) {
// TODO Metrics used to default to 0 with java.io.File even if directory didn't exist
// Should throw an exception and initialize data directory before metrics
totalSpace = 0L;
useableSpace = 0L;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This here was questionable. Upon a Solr core being initialized, metrics was trying to calculate and initialize a gauge for the data/ directories before the directory was ever created. This always failed with File but it defaults to 0. Path on the other hand throws an Exception. This didn't seem right but also moving where the initializeMetrics is called in a bunch of different places had tests failing and I wasn't sure why so I kept it as TODO.

But also if this is initializing metrics for a SolrCore registry upon creation, shouldn't it always be 0 as the data directory was most likely just created?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not pre-compute this stuff; it should only be calculated/fetched when the metric is fetched. Maybe could clarify that the TODO is to move metrics initialization to after data directory initialization, thus avoiding this exception (hopefully) but that may not be entirely possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just initialized the gauges to 0 and added the TODO. Could be worth moving metric initialization to after the directory is created if we want that data point of the disk usage.


final long finalTotalSpace = totalSpace;
final long finalUsableSpace = useableSpace;
parentContext.gauge(() -> finalTotalSpace, true, "totalSpace", Category.CORE.toString(), "fs");
parentContext.gauge(
() -> dataDirFile.getUsableSpace(), true, "usableSpace", Category.CORE.toString(), "fs");
() -> finalUsableSpace, true, "usableSpace", Category.CORE.toString(), "fs");

parentContext.gauge(
() -> dataDirPath.toAbsolutePath().toString(),
() -> dataDirFile.toAbsolutePath().toString(),
true,
"path",
Category.CORE.toString(),
Expand Down
5 changes: 1 addition & 4 deletions solr/core/src/java/org/apache/solr/core/SolrPaths.java
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,6 @@

package org.apache.solr.core;

import java.io.File;
import java.lang.invoke.MethodHandles;
import java.nio.file.Path;
import java.nio.file.Paths;
Expand All @@ -43,9 +42,7 @@ private SolrPaths() {} // don't create this

/** Ensures a directory name always ends with a '/'. */
public static String normalizeDir(String path) {
return (path != null && (!(path.endsWith("/") || path.endsWith("\\"))))
? path + File.separator
: path;
return (path != null && (!(path.endsWith("/") || path.endsWith("\\")))) ? path + "/" : path;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow that's gnarly.. Is there no existing normalize() function in any of our jars that would do this for us? I guess fixing that is orthongonal to this effort....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code was fine as it was but perhaps got your attention because File is referenced. Since we only need to know the separator, try FileSystems.getDefault().getSeparator()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking, we shouldn't need to call FileSystems.getDefault().getSeparator() assuming callers of this eventually take this string and put it into a Path, but I don't think that is necessarily true. Seeing something like this line it's comparing strings 1:1 but my slash would probably break it on a different OS that uses a different delimiter. Directory strings should probably be checked through Path.equal(), not strings directly. I just put it back with FileSystems.getDefault().getSeparator()

}

/**
Expand Down
21 changes: 11 additions & 10 deletions solr/core/src/java/org/apache/solr/core/SolrResourceLoader.java
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,6 @@

import com.google.common.annotations.VisibleForTesting;
import java.io.Closeable;
import java.io.File;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
Expand All @@ -34,6 +33,7 @@
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
Expand Down Expand Up @@ -353,12 +353,14 @@ public ClassLoader getClassLoader() {
*/
@Override
public InputStream openResource(String resource) throws IOException {
if (resource.trim().startsWith("\\\\")) { // Always disallow UNC paths
throw new SolrResourceNotFoundException("Resource '" + resource + "' could not be loaded.");
String pathResource = Paths.get(resource).normalize().toString();
if (pathResource.trim().startsWith("\\\\")) { // Always disallow UNC paths
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw we have a SolrPaths.assertNotUnc(dir);, could it be used here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why trim after we've called normalize? If trimming had been first thing; probably needs to continue to be first thing. I don't like trim in most places including here though (it's generally sloppy); in a major version we can get away with dropping it.
+1 to Eric's comment. And no need to call normalize at this spot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the find, Eric. I don't know why I overcomplicated this line with the normalize. Just passing the resource here instead so not much should have changed.

Also switched to assertNotUnc but the function checked both the path and the systems OS. I made a tiny modification to it which I am not sure if we should do but a test that was asserting the message was modified as well.

throw new SolrResourceNotFoundException(
"Resource '" + pathResource + "' could not be loaded.");
}
Path instanceDir = getInstancePath().normalize();
Path inInstanceDir = getInstancePath().resolve(resource).normalize();
Path inConfigDir = instanceDir.resolve("conf").resolve(resource).normalize();
Path inInstanceDir = getInstancePath().resolve(pathResource).normalize();
Path inConfigDir = instanceDir.resolve("conf").resolve(pathResource).normalize();
if (allowUnsafeResourceloading || inInstanceDir.startsWith(instanceDir)) {
// The resource is either inside instance dir or we allow unsafe loading, so allow testing if
// file exists
Expand All @@ -373,17 +375,17 @@ public InputStream openResource(String resource) throws IOException {

// Delegate to the class loader (looking into $INSTANCE_DIR/lib jars).
// We need a ClassLoader-compatible (forward-slashes) path here!
InputStream is = classLoader.getResourceAsStream(resource.replace(File.separatorChar, '/'));
InputStream is = classLoader.getResourceAsStream(pathResource);

// This is a hack just for tests (it is not done in ZKResourceLoader)!
// TODO can we nuke this?
if (is == null && System.getProperty("jetty.testMode") != null) {
is = classLoader.getResourceAsStream(("conf/" + resource).replace(File.separatorChar, '/'));
is = classLoader.getResourceAsStream(Path.of("conf").resolve(pathResource).toString());
}

if (is == null) {
throw new SolrResourceNotFoundException(
"Can't find resource '" + resource + "' in classpath or '" + instanceDir + "'");
"Can't find resource '" + pathResource + "' in classpath or '" + instanceDir + "'");
}
return is;
}
Expand All @@ -404,8 +406,7 @@ public String resourceLocation(String resource) {
return inInstanceDir.normalize().toString();
}

try (InputStream is =
classLoader.getResourceAsStream(resource.replace(File.separatorChar, '/'))) {
try (InputStream is = classLoader.getResourceAsStream(Path.of(resource).toString())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows, wouldn't this actually convert to the platform's separator char?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct, bad that I removed the replace. Brought it back but changed to FileSystems.getDefault().getSeparator()

if (is != null) return "classpath:" + resource;
} catch (IOException e) {
// ignore
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,14 +16,14 @@
*/
package org.apache.solr.core.backup;

import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.lang.invoke.MethodHandles;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.time.Instant;
import java.util.Collections;
import java.util.List;
Expand Down Expand Up @@ -343,7 +343,7 @@ private void downloadConfigToRepo(ConfigSetService configSetService, String conf
// getAllConfigFiles always separates file paths with '/'
for (String filePath : filePaths) {
// Replace '/' to ensure that propre file is resolved for writing.
URI uri = repository.resolve(dir, filePath.replace('/', File.separatorChar));
URI uri = repository.resolve(dir, Path.of(filePath).toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could repository.resolve() take a path instead of a string?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No Eric; the BackupRepository abstraction is based around URI & String. We could change it (maybe that's your proposal) but that's a big change that deserves its own discussion/issue. I'd rather here see the existing code or something very close to it.

// checking for '/' is correct for a directory since ConfigSetService#getAllConfigFiles
// always separates file paths with '/'
if (!filePath.endsWith("/")) {
Expand Down
Loading
Loading