Should I (student) share my data with a researcher I don't know directly?

TL;DR - talk to your supervisor (which to be honest is the answer to about 90% of the questions on here, but hopefully this answer helps structure that discussion).

First of all, you might need to check whether you are even in a position to make this decision. The ownership of intellectual property arising from student projects can be a complex subject and you might want to check with your Knowledge Exchange and Commercialisation office or equivalent before sending a large dataset to an external partner. In most cases this is likely to be a formality and there will be no issues unless the dataset might be used commercially, but they may want to put a simple Material Transfer Agreement in place to define how the recipient can use or further distribute the dataset.

Assuming it's up to you, essentially you have three options:

  1. give them the data without restriction. Best-case scenario: co-authorship, worst-case: nothing.
  2. refuse to give them the data. Best-case scenario: nothing. Worst-case scenario: you and your supervisor look bad and you damage a potential future relationship with a collaborator or employer.
  3. give them the data after taking some actions to increase the likelihood of acknowledgement. There are three real options here:

3a) (mentioned by @gvgramazio) upload the data to a public repository. The preferred repository varies by field but I'd use an academic data repository like Figshare. You get a citeable DOI but this is not a publication, so if you're looking for publications, this doesn't help (it should, but that's another discussion).

3b) publish the dataset in a data journal like Scientific Data (published by Nature Portfolio); then everyone can use it but has to acknowledge you by citing the source.

3c) set up a material transfer agreement that defines how the recipient can use the data and what you get in return, which can formally include the option of co-authorship in some form if a manuscript is produced.

With option 1, the most likely outcome depends on their character. Are they a good person to collaborate with? Do they have a good reputation as a collaborator? This is probably the most important question, since if they're a good person to work with then I honestly can't see them not including you as a co-author anyway in this situation, although this could vary by field. Another, related consideration is whether you might be working with (or for) them in the future and, if so, whether sharing the data might increase your visibility or your chance of a job. If you don't know them, the only real way to assess this is by asking your supervisor, who knows them and also has an interest in supporting you.

Option 2 - refusing outright - really isn't likely to end well.

Option (3a) - a data repository - guarantees acknowledgement, but not necessarily in a form that's useful to you. Option (3b) - a data paper - is the most work but guarantees acknowledgement in a form that's likely to be useful to you. However, both (3a) and (3b) mean you're less likely to get a co-authorship.

Option (3c) is trickier. In my field MTAs are increasingly common but also very unpopular with some researchers. As the provider of the data, having it covered by an MTA is the safest option but negotiating one can be intimidating. Fortunately, most institutions have offices to handle this sort of thing.

You also need to consider a few other questions:

  • How helpful are the different possible outcomes going to be for your career? In my field you are assessed primarily on publications in peer-reviewed journals so 3b is preferable to 3a. However, if the friend is able to acknowledge the data by citing it, they are less likely to include you as a co-author unless you make further substantial contributions to the analysis (which they don't have to give you the opportunity to do).
  • Does their proposed analysis sound exciting? If it has a good chance of resulting in a really, really good paper and relies heavily on your dataset, then a chance of a co-authorship on a really high-impact paper might be better than a lead authorship on a data-only paper.

Ultimately, most of these questions will be things your supervisor is able to help you with. So I'd arrange to talk to your supervisor, explain your concerns, and ask which option they suggest.