What's the major difference between GloVe and Word2Vec?

Yes, they're both ways to train word embeddings. They both produce the same core output: one vector per word, with the vectors in a useful arrangement. That is, the vectors' relative distances and directions roughly correspond with human ideas of overall word relatedness, and even relatedness along certain salient semantic dimensions.
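If it helps to make that "useful arrangement" concrete, here's a minimal sketch of probing a pretrained model with gensim. (I'm assuming the `glove-wiki-gigaword-100` vectors available through gensim's downloader; the probe words are just illustrative.)

```python
# Sketch: probing relatedness in a pretrained embedding.
# Assumes gensim is installed and the 'glove-wiki-gigaword-100' vectors
# are available via the gensim-data downloader.
import gensim.downloader as api

wv = api.load("glove-wiki-gigaword-100")  # downloads the vectors on first use

# Nearest neighbors reflect overall word relatedness.
print(wv.most_similar("frog", topn=5))

# Vector directions can capture salient semantic dimensions
# (the classic king - man + woman analogy probe).
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```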

Word2Vec does incremental, 'sparse' training of a shallow neural network by repeatedly iterating over a training corpus; each training example only nudges the vectors of the words involved.
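For a concrete feel of that training loop, here's a rough sketch using the gensim library (gensim 4.x parameter names assumed; the toy corpus is made up):

```python
# Sketch: incremental Word2Vec training with gensim (4.x API assumed).
from gensim.models import Word2Vec

corpus = [
    ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"],
    ["the", "dog", "barked", "at", "the", "fox"],
]  # toy stand-in for a real tokenized corpus

model = Word2Vec(
    sentences=corpus,
    vector_size=100,  # dimensionality of the word-vectors
    window=5,         # context window size
    min_count=1,      # keep every word in this tiny example
    sg=1,             # 1 = skip-gram, 0 = CBOW
    epochs=5,         # passes over the corpus
)

vec = model.wv["fox"]                        # one vector per word
print(model.wv.most_similar("fox", topn=3))  # neighbors by cosine similarity
```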

GloVe instead fits word-vectors to model a giant word-to-word co-occurrence matrix built up front from the whole corpus.
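To make the co-occurrence idea concrete, here's a simplified pure-Python sketch of the kind of matrix GloVe starts from. (Real GloVe also down-weights counts by distance within the window, and then fits vectors and biases so their dot products approximate the log counts.)

```python
# Sketch: counting the word-word co-occurrences that GloVe fits its vectors to.
# Simplified: symmetric window, no distance weighting.
from collections import defaultdict

corpus = [
    ["the", "quick", "brown", "fox"],
    ["the", "lazy", "dog"],
]
window = 2
cooc = defaultdict(float)  # (word, context_word) -> co-occurrence count

for sentence in corpus:
    for i, word in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                cooc[(word, sentence[j])] += 1.0

# GloVe then learns vectors w_i, context-vectors c_j, and biases b_i, b_j so that
# dot(w_i, c_j) + b_i + b_j ≈ log(cooc[i, j]), with a weighting function that
# down-weights very rare pairs.
print(dict(cooc))
```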

Given the same corpus, the same word-vector dimensionality, and the same attention to meta-optimizations, the two algorithms produce word-vectors of roughly similar quality. (When I've seen someone confidently claim one or the other is definitely better, they've often compared some tweaked/best-case use of one algorithm against some rough/arbitrary defaults of the other.)

I'm more familiar with Word2Vec, and my impression is that Word2Vec's training scales better to larger vocabularies, and that it has more tweakable settings which, if you have the time, might let you tune your word-vectors more to your specific application. (For example, the choice of a small versus large window parameter has a strong effect on whether a word's nearest neighbors are 'drop-in replacement' words or, more generally, words used in the same topics. Different downstream applications may prefer word-vectors that skew one way or the other.)
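As a rough sketch of that window effect (gensim again; the tiny corpus here is only a placeholder, and on real data the contrast between the two neighbor lists is much clearer):

```python
# Sketch: how the window parameter steers nearest-neighbor behavior
# (gensim 4.x assumed; the toy corpus stands in for real tokenized text).
from gensim.models import Word2Vec

corpus = [
    ["the", "river", "bank", "was", "muddy", "after", "the", "flood"],
    ["she", "deposited", "the", "check", "at", "the", "bank", "downtown"],
]

# Small window: neighbors lean toward 'drop-in replacement' words.
small = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1)
# Large window: neighbors lean toward words used in the same topics.
large = Word2Vec(sentences=corpus, vector_size=50, window=10, min_count=1)

print(small.wv.most_similar("bank", topn=5))
print(large.wv.most_similar("bank", topn=5))
```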

Conversely, some proponents of GloVe tout that it does fairly well without needing metaparameter optimization.

You probably wouldn't use both, unless comparing them against each other, because they play the same role for any downstream applications of word-vectors.