◐ Shell
clean mode source ↗

src: don't abort when package.json is a directory by bnoordhuis · Pull Request #18270 · nodejs/node

@nodejs-github-bot nodejs-github-bot added c++

Issues and PRs that require attention from people who are familiar with C++.

fs

Issues and PRs related to the fs subsystem / file system.

labels

Jan 20, 2018

richardlau

bmeck


CHECK_GE(numchars, 0);
if (numchars < 0)
return;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please return the empty string so that it gets put into packageMainCache

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't sound right. '' is for a package.json without a main property, see the bottom of this function.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm look at

return packageMainCache[requestPath] = undefined;

which eagerly returns with a falsey value and sets an entry.

then bails out. this also looks like it might have a bug re-reading it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but that's for when we found (and read) a package.json. That's not the case here, it's closer to an ENOENT or EPERM condition.

edit: to be clear, that's the if (json === undefined) return false a few lines up.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree, we found a package.json it was a directory though.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, not a file we can parse.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm kind of done arguing but I'll address your last point: by your logic errors such as EPERM should also be interpreted as meaning "package.json without a main entry" - but we don't do that and rightly so.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EPERM is different, you can EPERM a directory. We can inspect this entity on the file system.

@bmeck

merging to my PR. would like to see more cooperation towards mentorship in future.

@bnoordhuis

@bmeck

@bnoordhuis I disagree on landing it, but would like to know if your block required the difference in our opinions.

bmeck

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explanation of landing difference

addaleax

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as this is.

Given that I’d expect directories called package.json to almost never show up in the real world, I’m okay with tending towards semantical correctness and not entering the entry into the cache.

(I’d also be okay with the other approach.)

@bmeck

@addaleax falsey values put into the cache don't even work today, hence my need of rereview of the "parent" PR. I still think semantics are closer to a file with empty content than a permissions error since we can inspect the entity on disk.

@addaleax

@bmeck I disagree with your last statement.

Also, is there any practical difference?

@bmeck

@addaleax

I disagree with your last statement.

Then we are at an impasse. I feel rather strongly one way, and others feel another way.

Also, is there any practical difference?

If it gets put into the cache or not, so C++ calls back into the function / stats. Likelyhood of this situation occurring is low but seems relevant still. Of note we also return "" for packages without main fields at all rather than treat it as an error.

@bnoordhuis

would like to know if your block required the difference in our opinions

I don't understand the question. What block?

@bmeck

@bnoordhuis

so I can move forward with work that depends on it.

@bnoordhuis

Still not sure I'm following... the work I'm referring to is unrelated to the module system - I want to factor out file reading in C++.

@bmeck

So it can move forward without this PR?

@bnoordhuis

No, because I want to repurpose code that this PR touches.

@bmeck

@bnoordhuis

If it is not dependent on the module system, but needs some code this PR touches, Would your refactor be any different, due to making the InternalModuleReadJSON return "" as I requested above?

As an alternative I'd also be fine removing "" from being returned on missing "main" in package.json, since it would never be hit by the cache check anyway without a fix.

The consistency between the two is my real issue with not returning "" so as long as it is one way or another.

@bmeck

I guess both PRs can go to tsc-review if we are stuck as well.

guybedford

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good to me, the function is returning the package.json contents, so an empty return corresponding to no package.json file seems sensible in terms of the function's operation.

That said, the lack of a package.json can be cached just like the package.json itself, which doesn't currently seem to be the case. Having this scenario register into the cache may be a worthwhile followup PR especially with the ESM loader package.json checks which we should be getting to share this cache as well.

@bmeck

Cache of all package.json situations seems fine to me, but my concern is we don't have a clear definition with what it means to be an error. We have:

  1. some error behavior that doesn't get cached for things like ENOENT and EPERM
  2. some defaulting behavior with the missing "main" string in the file contents in which we attempt to cache as a permanent "error"?
  3. we don't cache JSON.parse errors, and we even throw on them
  4. we don't actually check if the JSON itself has a "main" property using getOwnPropertyDescriptor or anything, but we do cache the result.
    • you can have odd situations here as well with things like 0 as the package.json contents.

I'm trying to figure out how to make this more internally consistent:

1 doesn't cache errors, but doesn't treat it as having a value.
2 caches the error with a default value.
3 doesn't cache errors, but does throw.
4 just seems like a bug

I would like to choose one of 1, 2, 3, or caching the error and throwing as a consistent behavior with a personal preference on always caching the results. I think this preference is from working with ESM to a small extent that by spec caches errors permanently when importing other modules.

@bnoordhuis

@bmeck Can you retract your red X? It's fine if you want to discuss semantics but please open a new issue. It shouldn't hold up a bug fix.

@bmeck

@bnoordhuis we have a disagreement, so no. Per stopping fixes, you also keep a red X on my PR. Your choice to open a new PR and not contribute to the original makes this rather hard as well. Maybe keeping things in one place would have been better, but now we have 2 PRs that neither of us likes apparently. One that does acknowledge both contributors involved and solved the review comments. You can remove your X on mine or take one of the 2 suggestions above we can land either PR.

In the future, I highly recommend a different approach to superceding PRs be looked into as your behavior of open a PR that only you commit to ontop of mine has happened a couple times and this scenario seems to be common that you give me a red X and tell me that I am blocking progress or have even insulted me in the past.

@bnoordhuis

Well, okay then. @nodejs/tsc Please weigh in.

@targos

I think we should treat this the same as ENOENT here and make all of this more consistent in a subsequent PR.

This was referenced

Apr 7, 2024

This was referenced

Apr 7, 2024

This was referenced

Apr 7, 2024

This was referenced

Apr 7, 2024

This was referenced

Apr 8, 2024

This was referenced

Oct 21, 2024