Fix: Allow pasting PDF URLs into main table to create entries #12911

Merged
merged 11 commits into JabRef:main from fix-pdf-url-paste
Apr 14, 2025

Kaan0029
Contributor

@Kaan0029 Kaan0029 commented Apr 8, 2025

Previously, pasting a URL that ended in .pdf into the main table showed an error: “Could not find suitable import format.” This change lets users paste PDF links directly (e.g. on Windows 10, by clicking into the main table and pressing CTRL + V). JabRef then automatically adds a corresponding BibEntry with the extracted metadata and the downloaded file attached.

Closes https://github.com/JabRef/jabref-issue-melting-pot/issues/201

This change:

  • Adds support for detecting and handling .pdf URLs in ImportHandler#handleStringData
  • Downloads the PDF into the library’s file directory
  • Extracts metadata using PdfMergeMetadataImporter
  • Creates a new BibEntry and attaches the downloaded file
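
The steps above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `PdfUrlPasteSketch`, `Downloader`, `MetadataExtractor`, and `Entry` are hypothetical stand-ins for `ImportHandler`, the URL download, `PdfMergeMetadataImporter`, and `BibEntry`.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class PdfUrlPasteSketch {

    /** Hypothetical stand-in for the download step. */
    interface Downloader {
        void download(URI source, Path target) throws IOException;
    }

    /** Hypothetical stand-in for metadata extraction; returns an empty map if nothing is found. */
    interface MetadataExtractor {
        Map<String, String> extract(Path pdf) throws IOException;
    }

    /** Hypothetical stand-in for a BibEntry with one attached file. */
    record Entry(Map<String, String> fields, Path attachedFile) { }

    /** True if the pasted text parses as an http(s) URL whose path ends in ".pdf". */
    static boolean isPdfUrl(String data) {
        if (data == null) {
            return false;
        }
        try {
            URI uri = new URI(data.trim());
            String scheme = uri.getScheme();
            String path = uri.getPath();
            return scheme != null
                    && (scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))
                    && path != null
                    && path.toLowerCase(Locale.ROOT).endsWith(".pdf");
        } catch (URISyntaxException e) {
            return false;
        }
    }

    static List<Entry> handleStringData(String data, Path fileDirectory,
                                        Downloader downloader,
                                        MetadataExtractor extractor) throws IOException {
        if (!isPdfUrl(data)) {
            // The real method would continue with the other import formats here.
            return List.of();
        }
        URI uri = URI.create(data.trim());
        String path = uri.getPath();
        String fileName = path.substring(path.lastIndexOf('/') + 1);

        // 1. Download the PDF into the library's file directory.
        Path target = fileDirectory.resolve(fileName);
        downloader.download(uri, target);

        // 2. Extract metadata; an empty map models the "no metadata" fallback.
        Map<String, String> fields = extractor.extract(target);

        // 3. Create one entry and attach the downloaded file.
        return List.of(new Entry(fields, target));
    }
}
```

Passing the download and extraction steps in as interfaces is a choice made only for this sketch; it keeps the example runnable without network access or a PDF parser.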

Manual Testing

This feature was manually tested:

  • Pasted a .pdf URL into the main table
  • Confirmed the file was downloaded into the database's file directory
  • Confirmed a new BibEntry was created with metadata and file linked
  • Confirmed fallback behavior: when no metadata is present, an empty entry with the file is created

No test case added because this feature relies on clipboard input and file download behavior, which are difficult to mock in unit tests.

Screenshots

Screenshot 2025-04-07 at 10 54 31 PM

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
  • [/] Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (if change is visible to the user)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

Member

@subhramit subhramit left a comment

Comments on first look

@@ -399,6 +402,16 @@ public List<BibEntry> handleStringData(String data) throws FetcherException {
if ((data == null) || data.isEmpty()) {
    return Collections.emptyList();
}
LOGGER.debug("Checking if URL is a PDF: {}", data);

if (org.jabref.logic.util.URLUtil.isURL(data) && data.toLowerCase().endsWith(".pdf")) {
Member

Why use a fully qualified package name instead of an import?

Contributor Author

Thanks, I changed it.
Unhealthy reflex sometimes😅

@@ -399,6 +402,16 @@ public List<BibEntry> handleStringData(String data) throws FetcherException {
if ((data == null) || data.isEmpty()) {
    return Collections.emptyList();
}
LOGGER.debug("Checking if URL is a PDF: {}", data);
Member

Remove

Member

For me it would be OK. Maybe with "trace" level?

Contributor Author

I agree with the comments. Trace level seems suitable since it provides low level detail. In hindsight, debug feels too noisy (especially given how often handleStringData gets called)

I changed it to trace level

        urlDownload.toFile(targetFile);
    } catch (FetcherException fe) {
        LOGGER.error("Error downloading PDF from URL", fe);
        throw new IOException("Error downloading PDF", fe);
Member

Why does this have to be wrapped into an IOException?

Contributor Author

Yeah, it doesn't have to be. Thanks for the catch! I changed it.

        entries.add(emptyEntry);
    }
    return entries;
} catch (Exception ex) {
Member

Do not catch generic exceptions.

Contributor Author

I changed it to IOException.
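
The two review points above (no wrapping into `IOException`, no generic `catch (Exception ...)`) can be illustrated with a small sketch. It is not the code from this PR: `SpecificCatchSketch`, `Step`, and the local `FetcherException` class are hypothetical stand-ins, assuming only that the enclosing method already declares `throws FetcherException`, as the hunk header of `handleStringData` shows.

```java
import java.io.IOException;
import java.util.List;

final class SpecificCatchSketch {

    /** Hypothetical stand-in for JabRef's checked FetcherException. */
    static class FetcherException extends Exception {
        FetcherException(String message) {
            super(message);
        }
    }

    /** Hypothetical download-and-import step that can fail in two distinct ways. */
    interface Step {
        String run() throws FetcherException, IOException;
    }

    // The method declares FetcherException, so a download failure simply
    // propagates to the caller; there is no need to wrap it in an IOException.
    static List<String> importFrom(Step step) throws FetcherException {
        try {
            return List.of(step.run());
        } catch (IOException e) {
            // Narrow catch: only the expected I/O failure is handled here.
            // Programming errors such as NullPointerException are not swallowed.
            return List.of();
        }
    }
}
```

With a generic `catch (Exception ex)`, the `FetcherException` and any unchecked exception would end up in the same handler, which hides the difference between a failed download and a bug.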

Member

Contributor Author

I reverted them. Thank you for the reminder!

@Kaan0029 Kaan0029 force-pushed the fix-pdf-url-paste branch from 6c9ee9f to ef322c6 Compare April 10, 2025 01:25
@Kaan0029
Contributor Author

Kaan0029 commented Apr 10, 2025

Could someone please create an issue for this PR? The issue was listed in a non-public repo.
Thanks in advance!

Edit: Sorry for the force push (ran into some sync issues with my local setup after a rebase)

@Siedlerchr
Member

No need for rebasing. We squash all commits after merging the PR

}
}

private String deriveFileNameFromUrl(String url) {
Member

Use FileUtil.getValidFileName
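
As a rough illustration of this suggestion, the helper could take the last path segment of the URL and pass it through a sanitizer. The sketch below is self-contained and makes assumptions: `toValidFileName` is a hypothetical, simplified replacement for `FileUtil.getValidFileName` (whose exact behavior may differ), and the fallback name `downloaded.pdf` is invented for the example.

```java
import java.net.URI;

final class FileNameFromUrlSketch {

    // Hypothetical, simplified sanitizer: replaces characters that are
    // commonly illegal in file names. Not JabRef's FileUtil.getValidFileName.
    static String toValidFileName(String name) {
        String cleaned = name.replaceAll("[\\\\/:*?\"<>|]", "_");
        return cleaned.isBlank() ? "downloaded.pdf" : cleaned;
    }

    // Throws IllegalArgumentException if the text is not a syntactically valid URI.
    static String deriveFileNameFromUrl(String url) {
        // getPath() excludes the query and fragment and decodes percent-escapes.
        String path = URI.create(url).getPath();
        String lastSegment = (path == null) ? "" : path.substring(path.lastIndexOf('/') + 1);
        return toValidFileName(lastSegment);
    }
}
```

Sanitizing after decoding matters because a percent-escaped character such as `%3F` turns into a literal `?` in the decoded path, which is not allowed in file names on some platforms.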

@Siedlerchr
Member

Please link the melting pot issue, even if it's not visible for everyone. In the changelog you can create an entry and reference the PR number.

@Siedlerchr Siedlerchr added the status: changes-required Pull requests that are not yet complete label Apr 12, 2025
@Kaan0029
Contributor Author

Please link the melting pot issue, even if it's not visible for everyone. In the changelog you can create an entry and reference the PR number.

Unfortunately, I didn't receive a melting pot link for this issue. I just received a problem description and proceeded with solving it. Do you want me to share that "description" with you?

@ThiloteE
Member

ThiloteE commented Apr 13, 2025

If there is no issue, then it's fine. The description in the pull-request is good enough as is.

Siedlerchr
Siedlerchr previously approved these changes Apr 13, 2025
Member

@Siedlerchr Siedlerchr left a comment

Works

@Siedlerchr Siedlerchr enabled auto-merge April 13, 2025 17:27
@Siedlerchr Siedlerchr added this pull request to the merge queue Apr 13, 2025
}
LOGGER.trace("Checking if URL is a PDF: {}", data);

if (URLUtil.isURL(data) && data.toLowerCase().endsWith(".pdf")) {
Member

I wonder whether you can somehow extract the last element of the URL, so that it resembles a path.

And then use method FileUtil#isPDFFile (https://github.com/JabRef/jabref/blob/main/src/main/java/org/jabref/logic/util/io/FileUtil.java#L494)

Member

If extracting the last element would take too much code, then it's okay to leave it as is.

Contributor Author

I implemented your suggestions.
This added a few more lines, but I prefer it that way: it is cleaner and more maintainable.
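
The approach discussed in this thread, extracting the last URL element and checking it like a file path, might look roughly like the sketch below. It is an illustration under assumptions, not the merged code: `getFileNameFromUrl` and `isPdfFile` here are simplified, hypothetical counterparts of the `URLUtil` helper and `FileUtil#isPDFFile`, whose real implementations may differ. The sketch also does not verify that the text is a URL at all; the original code combines the check with `URLUtil.isURL`.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Optional;

final class PdfUrlDetectionSketch {

    /** Returns the last path segment of the URL, ignoring query and fragment. */
    static Optional<String> getFileNameFromUrl(String url) {
        try {
            String path = new URI(url).getPath();
            if (path == null || path.isEmpty() || path.endsWith("/")) {
                return Optional.empty();
            }
            return Optional.of(path.substring(path.lastIndexOf('/') + 1));
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }

    /** Hypothetical, simplified extension check on a path. */
    static boolean isPdfFile(Path file) {
        Path name = file.getFileName();
        return name != null && name.toString().toLowerCase(Locale.ROOT).endsWith(".pdf");
    }

    static boolean isPdfUrl(String url) {
        try {
            return getFileNameFromUrl(url)
                    .map(name -> isPdfFile(Path.of(name)))
                    .orElse(false);
        } catch (InvalidPathException e) {
            // Some decoded URL characters are not valid in paths on every platform.
            return false;
        }
    }
}
```

Compared with `data.toLowerCase().endsWith(".pdf")`, this variant also accepts URLs such as `https://example.org/paper.pdf?download=1`, where the extension is followed by a query string.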

@Siedlerchr Siedlerchr removed this pull request from the merge queue due to a manual request Apr 13, 2025
@Siedlerchr Siedlerchr added status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers and removed status: changes-required Pull requests that are not yet complete labels Apr 13, 2025
@jabref-machine
Collaborator

Your pull request needs to link an issue.

To ease organizational workflows, please link this pull-request to the issue with syntax as described in https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue:

Linking a pull request to an issue using a keyword

You can link a pull request to an issue by using a supported keyword in the pull request's description or in a commit message. The pull request must be on the default branch.

  • close
  • closes
  • closed
  • fix
  • fixes
  • fixed
  • resolve
  • resolves
  • resolved

If you use a keyword to reference a pull request comment in another pull request, the pull requests will be linked. Merging the referencing pull request also closes the referenced pull request.

The syntax for closing keywords depends on whether the issue is in the same repository as the pull request.

Examples

  • Fixes #xyz links pull-request to issue. Merging the PR will close the issue.
  • Fixes https://github.com/JabRef/jabref/issues/xyz links pull-request to issue. Merging the PR will close the issue.
  • Fixes https://github.com/Koppor/jabref/issues/xyz links pull-request to issue. Merging the PR will close the issue.
  • Fixes [#xyz](https://github.com/JabRef/jabref/issues/xyz) links pull-request to issue. Merging the PR will NOT close the issue.

@Siedlerchr Siedlerchr added this pull request to the merge queue Apr 14, 2025
Merged via the queue into JabRef:main with commit efb055f Apr 14, 2025
1 check passed
krishnagjsForGit pushed a commit to krishnagjsForGit/jabref that referenced this pull request May 2, 2025
…#12911)

* Improve handling of PDF file import behavior

* Add support for handling PDF URLs when pasted

* Revert submodule changes

* Replace Collections.emptyList() with List.of() and improve exception handling

* Change log level from debug to trace for PDF URL

* Registered fix in changelog

* Used getValidFileName for helper method deriveFileNameFromUrl

* Refactor PDF URL detection using FileUtil.isPDFFile

* Move getFileNameFromURL helper method to URLUtil class
Labels
status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers

7 participants